Data Science Intern, Applied Machine Learning
With the advent of genomic sequencing, we can finally decode and process our genetic makeup. We now have more data than ever before, but providers don't have the infrastructure or expertise to make sense of it. Here at Tempus, we believe the greatest promise for the detection and treatment of cancer lies in a deep understanding of the molecular activity behind disease initiation, progression, and efficacious treatment, based on the discovery of unique biomarkers.
We're on a mission to connect an entire ecosystem to redefine how patient data is used in clinical settings. Our Data Science team is passionate about applying state-of-the-art machine learning techniques to the processing and analysis of vast amounts of clinical, molecular, and imaging data.
What You’ll Do
- Collaborate with product, science, engineering, and business development teams to build the most advanced data platform in precision medicine
- Design and prototype novel data visualization and analysis tools and algorithms
- Wrangle and analyze large, diverse, sparse datasets, extract insights, and drive further research opportunities
- Interrogate analytical results for robustness, validity, and out-of-sample stability
- Document, summarize, and present your findings to a group of peers and stakeholders
Required Qualifications
- Degree in computer science, software engineering, statistics, machine learning, bioinformatics, or a related technical field
- Project experience building and validating predictive models on structured or unstructured data
- Proficiency in Python and SQL
- Experience with the following: Pandas, NumPy, SciPy, Scikit-learn, Jupyter Notebooks
- Experience with supervised and unsupervised machine learning algorithms and ensemble methods, such as K-Means, PCA, Regression, Neural Networks, Decision Trees, and Gradient Boosting
- Experience working in a Linux / Mac environment
Preferred Qualifications
- 2+ years of full-time experience building and validating predictive models on structured or unstructured data
- Track record in Kaggle.com competitions and/or kernels
- Experience working with clinical and/or genomic data
- Experience with AWS architecture
- Experience with: Git, matplotlib, seaborn, HTML5, CSS3, JavaScript, D3, Plot.ly, Flask, Dask
- Experience in agile environments and comfort with quick iterations