Senior Machine Learning Infrastructure Engineer
Overview
At Ascent we are building an intelligent compliance platform that enables compliance professionals to easily track and understand their compliance obligations and the related regulations. To support that platform, we are also building a data and machine-learning back-end; this back-end will house, coordinate, and route all of our data pipelines and machine learning models.
We are looking for experienced, passionate infrastructure engineers to help us build and maintain that platform. As a Senior Machine Learning Infrastructure Engineer at Ascent, you will work closely with other engineers, data scientists, and the broader tech team to design and build infrastructure that a) ingests and handles data (e.g. regulations, customer data, machine learning features); b) deploys and coordinates data microservices such as machine learning models or other transformations; c) facilitates workflows across these microservices and data layers; and d) tests, monitors, and reports on itself. Given the increased responsibility at the Senior level, you will play a strong role in designing this infrastructure and thinking through the long-term implications of design decisions, as well as in hiring and mentoring more junior engineers.
We use both hosted solutions and open source tools. We have a strong bias towards containerization and internal transparency. We also place a high premium on our culture and values, both within the tech team and the company as a whole. We believe a diversity of opinions and perspectives creates a stronger team and product, and we are committed to an equal opportunity hiring process.
Responsibilities
- Design and build infrastructure to host machine learning models as microservices using modern conventions and coding practices
- Help design and implement data models and database layers that support our data science activities
- Use creativity and independent thinking to solve technical problems
- Mentor more junior team members
- Communicate clearly and effectively with technical and non-technical colleagues about our data engineering projects
- Work closely with data scientists to understand their needs and processes
- Work closely with our whole technology team to successfully maintain our data platform alongside the broader technology stack
- Implement strong and consistent internal API conventions and documentation
- Implement solutions with an emphasis on testing, maintainability, and clean coding practices, keeping designs simple and reducing technical debt
Minimum Skills and Experience
- 3+ years building and maintaining back-end services in production, preferably using container-based architectures
- 1+ years building and maintaining data science pipelines that incorporate machine learning models in a production environment behind an API
- Experience with large data sets and modern tools for handling them, such as Apache Spark, Kafka, Cassandra, and Mesos
- Experience using AWS tools and services
- Ability to develop creative technical solutions given a set of business requirements and a strong understanding of modern data architectures
- Ability to work productively on small teams and lead workstreams independently if needed
- Experience mentoring less experienced colleagues
- Ability to communicate technical ideas to non-technical colleagues
- Proficiency in SQL, *nix CLI tools (grep, sed, awk, Bash, etc.), and Python
- Familiarity with Java and an understanding of the JVM ecosystem
- Experience deploying and maintaining code using git-based tools and operating in a continuous deployment/integration environment
- Experience writing thorough tests and documentation for maintainable code-bases
Preferred Skills and Experience
- 3+ years building and maintaining data science pipelines that incorporate machine learning models in a production environment behind an API
- Experience working with data scientists in production roles
- Experience with container management solutions like Kubernetes, Marathon, etc
- Experience with "fast data" architectures and related tools such as DC/OS
- Experience storing and using large amounts of text data and text transformations