SpotHero is seeking a Senior Data Engineer to join its Data Engineering squad. This squad works with a number of data consumers, such as Data Science, Marketing, Engineering, and our Business Analyst team, to provide data platform solutions that meet their day-to-day needs and long-term vision.
This Senior Data Engineer role centers on backend application development: designing pipelines, modeling the data that flows through them, and ensuring the timeliness and quality of that data. In addition, this individual will be expected to provision, observe, and maintain infrastructure, both AWS-managed and open source solutions.
Who we are:
SpotHero is one of transportation's hottest tech companies! We’re rapidly growing with the mission of bringing the parking industry into the future through technology. Drivers across the nation use the SpotHero mobile app or website to reserve convenient, affordable parking on-the-go or in advance, and parking companies rely on us to help them reach new customers while optimizing their business. We connect the dots with cutting-edge technology, delivering value to both sides of this exciting, evolving marketplace.
What will you do:
- Work with our analytics, marketing and data science teams to understand our data processing needs.
- Be a key hands-on contributor to the design and implementation of our data platform solutions from the infrastructure layer up to the API.
- Model and architect our data in a way that will scale with the increasingly complex ways we’re analyzing it.
- Build robust pipelines that make sure data is where it needs to be, when it needs to be there.
- Build frameworks and tools to help our software engineers, data analysts, and data scientists design and build their own data pipelines in a self-service manner.
- Performance-test and tune our systems to ensure they always scale to meet our needs.
- Be a key member of the team focused on pure hands-on contribution to the implementation and operation of our data platform.
- Data Modeling/Architecting
- Design data models with a broader understanding of underlying systems.
- Identification and implementation of appropriate abstractions for immediate requirements.
- Build performant, well-documented models with quality in mind.
- Consult with stakeholders on the best practices for creation and deployment of data models and data flows.
- Define and enforce service level agreements between owned products and their stakeholders, including configuring monitoring and alerting.
- Maintain a good understanding of data lineage and the dependencies between data pipelines.
- Maintain existing ETLs and develop new ETL processes from scratch, meeting the requirements laid out for you while keeping the bigger picture in view.
- Working with Data Processing Frameworks
- Capable of determining the best architecture (batch or streaming) for the applications being built.
- Evaluation of various frameworks and documentation of pros/cons for a wide audience.
- Working with Infrastructure
- Proficient at provisioning new infrastructure across environments.
- Capable of managing/integrating autoscaling, logging, monitoring, and alerting for the system. Your infrastructure as code is environment agnostic.
What we're looking for:
- You have at least 7 years of hands-on experience as an Engineer across multiple environments on complex distributed polyglot systems using Java, Scala, Clojure, Python, Go and/or C++.
- Strong SQL skills and data modeling experience.
- You can go up and down the stack from deep in the infrastructure layer all the way up to the client libraries.
- Deep understanding of object-oriented and/or functional programming patterns and paradigms.
- Hands-on experience with multiple data platforms and tools (e.g., S3, Redshift, Airflow, Spark, Presto, Hive).
- Minimum of 2-3 years of stakeholder management/enablement experience that cuts across multiple squads.
- Passion for ensuring the timeliness, availability, and quality of our highest-value datasets, meeting established SLOs.
- Ability to support the pieces of the codebase you own, and to understand the broader codebase with normal direction from peers and data engineers.
- Demonstrated experience with small teams that move fast - all members are expected to be able to achieve maximum results with minimal direction.
- Demonstrated experience measuring the impact of technical products across multiple domains through experimentation and statistical analysis.
- Strong ability to communicate on both business and technical subjects.
Nice to Haves:
- Kubernetes and/or Docker experience.
- Message-driven or streaming architectures, such as those built with Kafka, Spark, or Flink.
- Postgres, MySQL, or other RDBMS experience.
- AWS, GCP and/or Azure experience.
- Redshift, Presto, or other MPP database experience.
- Cassandra, Elastic, Redis and/or Couchbase experience.
- Airflow, Luigi, or other ETL scheduling tool experience.
- Open source contributions to a few major projects.
Technology we use:
- Our Android Stack is: Kotlin and XML (standard for Android apps) using MVI architecture (still working on refactoring old views), our database layer is built in Realm. Bitrise for CI/CD. We also make heavy use of Dagger, RxJava, Espresso (testing). Network stack uses Retrofit.
- Our iOS Stack is: Swift using MVC architecture, CoreData for Local Storage, XCUI for UI Testing, XCTest for Unit testing, SPM for Package Management, Fastlane for app automation and build scripts, Bitrise for CI/CD, and Sentry for crash reporting.
- Our Back End Stack is: Monolith using Django/Python/PostgreSQL. We are moving our Monolith to a Modular Monolith, using Domain Driven Design. When relevant we extract specific domains to services, currently using Java, Kotlin, and Go. We also use Docker and deploy our apps via Kubernetes. We use Kafka for asynchronous and gRPC for synchronous service-to-service communication. Our integrations are on .NET Core, moving to Kotlin.
- Our Front End Stack is: React/Redux, Sass, Jest/React Testing Library/Cypress, and Webpack. We maintain a private npm repository with shareable UI components, utility functions, Babel/ESLint/Prettier configurations, and custom tasks.
- Our Data Stack is: Postgres as our monolith database, with Redis for caching. We also use Redshift as our data warehouse and S3 as our data lake, which we query using Presto. We use Airflow and Spark for ETL, and do some stream processing (Kafka Streams and Spark at the moment). Our model pipeline uses scikit-learn and pandas. Our analysts use Looker as our business intelligence tool, and we use QuickSight for dashboards on our external data products.
- Our Dev Tools Stack is: AWS + Kubernetes for hosting. Terraform + Helm charts for IaC/deployment. ConcourseCI for CI/CD. Prometheus/Alertmanager/VictorOps for team alerting. We're starting to work on multi-region availability for our services.
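For a flavor of the ETL work this role involves, here is a minimal sketch of the extract-transform-load pattern in plain Python. This is purely illustrative: the function names, fields, and sample records are hypothetical assumptions, not SpotHero code, and in production each step would typically run as an Airflow task, with Spark handling heavier transforms and the load step writing to Redshift.

```python
# Illustrative ETL sketch: extract raw reservation records, transform
# them into a warehouse-friendly shape, and load them into an in-memory
# "warehouse". All names and data here are hypothetical.
from datetime import datetime
from typing import Dict, List


def extract() -> List[Dict]:
    # In a real pipeline this would read from S3 or an operational database.
    return [
        {"spot_id": 1, "price_cents": 1500, "reserved_at": "2024-01-05T09:30:00"},
        {"spot_id": 2, "price_cents": 2250, "reserved_at": "2024-01-05T18:10:00"},
    ]


def transform(rows: List[Dict]) -> List[Dict]:
    # Normalize types and derive the fields analysts will query.
    out = []
    for row in rows:
        reserved_at = datetime.fromisoformat(row["reserved_at"])
        out.append(
            {
                "spot_id": row["spot_id"],
                "price_usd": row["price_cents"] / 100,
                "reserved_date": reserved_at.date().isoformat(),
                "reserved_hour": reserved_at.hour,
            }
        )
    return out


def load(rows: List[Dict], warehouse: List[Dict]) -> int:
    # In a real pipeline this would COPY into Redshift; here we append.
    warehouse.extend(rows)
    return len(rows)


if __name__ == "__main__":
    warehouse: List[Dict] = []
    loaded = load(transform(extract()), warehouse)
    print(f"loaded {loaded} rows")
```

The self-service frameworks mentioned above wrap steps like these so analysts and data scientists can compose their own pipelines without re-implementing the plumbing.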
What we are offering:
- Career game changer – A truly unique experience to work for a fast-growing startup in a role with unlimited potential for growth.
- Excellent benefits –
- Flexible PTO policy and great work/life balance – We value and support each individual team member.
- Annual parking stipend – we help people park!
- The opportunity to collaborate with fun, innovative, and passionate people in a casual, yet highly productive atmosphere.
- A workplace recognized as the Best Consumer Web Company by Built in Chicago, Top Company Culture by Entrepreneur, a Top Workplace by Chicago Tribune, and one of Chicago’s Best Places to Work for Women Under 35 by Crain’s Chicago Business.
Steps to apply: Please include your GitHub account, LinkedIn profile, and any project that you're particularly proud of. We love seeing work that others loved working on.
SpotHero is an equal opportunity employer. We know that a diverse workforce is the strongest workforce, and are committed to building and supporting an inclusive environment for all.