Junior Data Engineer
DRIVIN is looking to expand our data team as we continue to grow our data platform. The ideal candidate has a strong background in Python and SQL. As a member of the data team, you will architect, plan, build, test, and deliver data solutions on our AWS enterprise data platform.
DRIVIN has a polyglot data model built on cutting-edge data platforms, including AWS Redshift for data warehousing, Elasticsearch for location-based searching, and Postgres for transactional data and product delivery. Our delivery framework comprises Python/Docker on ECS, Spark on EMR, and Jenkins for CI/CD.
The candidate should be a self-starter who is interested in learning new systems and environments and passionate about developing high-quality, supportable data service solutions for internal and external customers.
Requirements:
- 1-2 years' experience in PostgreSQL development, including functions, stored procedures, and indexing, or equivalent (required).
- Experience managing production data in a high-availability ODS/RDBMS product delivery environment, or equivalent (required).
- Experience planning and designing maintainable data schemas (required).
- 1-2 years' experience with Python, Docker, and data warehouse environments (required).
- Experience with GitHub, Jenkins (CI/CD), Artifactory, PyPI, or comparable delivery stacks (preferred).
- Experience with Postgres, Elasticsearch, AWS EMR, and AWS ECS (preferred).
- Experience with AWS Redshift, MPP databases, or DynamoDB (preferred).
- Experience with Kinesis/Kafka (preferred).
- Experience working with large enterprise data lakes (preferred).
- Experience with data analysis tools/applications such as Tableau, Qubole, or equivalent (preferred).
Responsibilities:
- Work with product, data science, analytics, and engineering teams to learn project data needs and define project scope.
- Design and plan data services solutions on the DRIVIN DAAS Platform.
- Build and deliver Python/Docker feed framework data pipeline jobs and services.
- Contribute to the Data Engineering team's delivery framework, including building reusable code, implementing industry best practices, and maintaining a common delivery framework.
- Monitor, maintain, document, and resolve incidents for scheduled production data jobs supporting internal and external customers' data needs.