Natural Language Processing Data Science Intern (Fall 2018)
Passionate about making a difference in the world of cancer genomics?
With the advent of genomic sequencing, we can finally measure and process our genetic makeup. We now have more data than ever before, but providers don't have the infrastructure or expertise to make sense of it, let alone use their extensive patient charting to complement the data obtained through genome sequencing. Here at Tempus, we believe that a holistic approach to the detection and treatment of cancer lies in a deep understanding of molecular activity coupled with the ability to use the latest NLP and predictive modeling techniques to extract information and insights from the patient’s chart.
Our Natural Language Processing Data Scientist Interns will use state-of-the-art techniques to process and analyze vast amounts of clinical data in ways that have never been done before. They’ll also help create, from the ground up, a highly scalable infrastructure to house billions of records. We’re looking for someone who will collaborate with product, research, and business development teams to develop the most advanced data fusion platform in cancer care.
Tempus is accepting resumes for paid part-time and full-time internships during academic semesters and the summer (and for unpaid internships for students receiving academic credit for the internship).
What you'll do:
- Help design and develop a novel bioinformatics platform with the capability of ingesting large unstructured clinical data sets to separate signal from noise and provide personalized insights at the patient level
- Develop innovative methods for processing and storing data
- Interrogate analytical results to assess algorithmic success, robustness and validity
Requirements:
- Working towards a Master’s degree or higher, or equivalent experience, in statistics, computer science, bioinformatics or a related field
- Experience with a variety of NLP methods for information extraction, topic modeling, parsing, and relationship extraction
- Experience with knowledge databases and language ontologies
- Quantitative training in probability, statistics and machine learning
  - Classical statistical tools, machine learning algorithms, ensemble methods
- Analytical development and programming skills
  - Python, R, JavaScript, or Lua
- Reproducible research methods
- Experience in genomics is a plus, especially experience with next-generation sequencing data processing and modeling
- Goal-oriented thinking
- Great problem-solving skills
- Self-driven and works well in an interdisciplinary team with minimal direction
- Experience communicating insights and presenting concepts to a diverse audience