Big Data Engineer
Passionate about making a difference in the world of cancer genomics?
With the advent of genomic sequencing, we can finally understand our genetic makeup. We now have more data than ever before, but providers don't have the infrastructure or expertise to make sense of it. Here at Tempus, we are building the infrastructure to modernize cancer treatment. By analyzing a patient’s genetic data in the context of molecular therapies, we empower physicians to make real-time, data-driven decisions in the clinic based on comprehensive computational analysis of a patient’s unique pathology. We're looking for engineers who are passionate about changing the status quo and bringing cancer care into the 21st century.
What You'll Do:
- Partner with product managers, software engineers, and data scientists to design, document, and implement relational and dimensional data models.
- Implement and maintain robust ETL/ELT pipelines to populate an EDW and data marts.
- Collaborate with Data Scientists to operationalize data processing models.
- Inform the data platform roadmap by distilling cloud technology options into an opinionated recommendation.
- Integrate source data quality measures into the EDW, and work with data owners to identify action plans to improve data quality.
Qualifications:
Must Have:
- Experience implementing solutions with at least one commercial / open-source ETL platform to load a multi-terabyte data warehouse.
- Excellent SQL skills and proficiency in query performance tuning.
- Ability to articulate use cases for open-source / cloud vendor big data technologies.
- Experience parsing semi-structured data (JSON, XML, EDI) and storing it in a tabular data model.
- Experience writing data parsing/transformation code in Python.
- Experience in agile environments and comfort with quick iterations.
- Prior experience in a role with production responsibility (database / system administrator).
- Demonstrated success in influencing solutions without formal authority.
Great if you have:
- Experience with enterprise workload automation / schedulers.
- Healthcare domain knowledge and experience with healthcare transmission formats and data models.
- Experience consuming data from NoSQL database systems.
- Experience with configuration management and infrastructure provisioning tools such as Terraform, Puppet, Chef, Ansible, Salt, or AWS CloudFormation.