Data Scientist, Applied Machine Learning - RNA
Here at Tempus, we believe the greatest promise for the detection and treatment of cancer and other diseases lies in building a deep understanding of the interaction between molecular attributes and clinical treatment. With the advent of genomic sequencing, we can finally measure and process our genetic makeup. We now have more data than ever before, but providers often don't have the infrastructure or expertise required to easily extract the valuable insights that exist within this data.
This is a unique opportunity to expand your expertise in cancer genetics and have a direct impact on the fight against cancer. We're on a mission to redefine how genomic data is used in a clinical setting through precision medicine. We are looking for data scientists who are passionate about applying state-of-the-art techniques to the processing and integrative analysis of vast amounts of clinical, molecular, and imaging data.
We are interested in a machine learning or statistics specialist with a strong genomics background who is eager to learn about recent advances in healthcare and to apply their skills to improve patient outcomes.
What You'll Do
- Analyze and integrate large, diverse clinical, molecular, and imaging datasets to extract insights and drive research opportunities.
- Redefine patient cohorts for clinical and research insights.
- Design and prototype novel analysis tools and algorithms.
- Collaborate with product, science, engineering, and business development teams to build the most advanced data platform in precision medicine.
- Interrogate analytical results for robustness, validity, and out-of-sample stability.
- Document, summarize, and present your findings to a group of peers and stakeholders.
Qualifications
- PhD degree in a quantitative discipline (e.g. statistical genetics, cancer genetics, bioinformatics, statistics, computational biology, applied mathematics, or similar).
- Experience with supervised and unsupervised machine learning algorithms and ensemble methods, such as PCA, regression, deep neural networks, decision trees, gradient boosting, generalized linear models, mixed-effects models, non-linear low-dimensional embeddings, and clustering.
- Proficient in Python, SQL and Docker.
- Experience with the following: Pandas, NumPy, SciPy, Scikit-learn, AWS and Jupyter Notebooks.
- Experience working in Linux/Mac and AWS cloud environments.
- Outstanding programming and problem-solving skills.
- Self-driven and able to work well in an interdisciplinary team with minimal direction.
- Able to thrive in a fast-paced environment and willing to shift priorities seamlessly.
- Experience with communicating insights and presenting concepts to diverse audiences.
- Strong peer-reviewed publication record.
- 2+ years of full-time employment or postdoctoral experience building and validating predictive models on structured or unstructured data.
- Previous experience working with large transcriptome data sets.
- Experience working with clinical and/or genomic data.
- Track record in Kaggle.com competitions and/or kernels.
- Experience in agile environments and comfort with quick iterations.