HPC Systems Engineer
DRW is a technology-driven, diversified principal trading firm. We trade our own capital at our own risk, across a broad range of asset classes, instruments and strategies, in financial markets around the world. As the markets have evolved over the past 25 years, so has DRW – maximizing opportunities to include real estate, cryptoassets and venture capital. With over 900 employees at our Chicago headquarters and offices around the world, we work together to solve complex problems, challenge consensus and deliver meaningful results. It’s a place of high expectations, deep curiosity and thoughtful collaboration.
We are looking for an HPC Systems Engineer for our Chicago office. This role will be responsible for all aspects of our HPC cluster that drives the research and development of our trading systems. You will monitor the health and utilization of the environment, detect and prevent problems, ensure high availability, and enable future growth and innovation in our research infrastructure. Your work will encompass everything from integration of the newest technologies for high performance computing to rapid development and deployment of custom trading tools.
What you will do:
- Manage a large compute cluster of Linux servers and related hardware
- Administer high performance storage and network components in our infrastructure
- Develop monitors for job performance, systems stats, and the general health of the infrastructure
- Architect upgrades to expand the size, scope, and performance of our cluster and continually integrate the latest technologies
- Coordinate with multiple teams for the deployment, operation, and maintenance of our data center footprint
- Assist in optimizing and tuning grid jobs to efficiently utilize compute resources
What you will need:
- Experience with Linux administration (Ubuntu or Debian preferred)
- Confidence with configuration management tools like Chef, Puppet, or Ansible
- Experience in administration of clusters of homogeneous machines
- Knowledge of, and comfort with, hands-on development in scripting languages like Bash, Python, and Ruby
- Familiarity with container and orchestration tools like Docker and Kubernetes
- Competence in networking fundamentals
- Skill with Linux package management tools like apt/dpkg or yum/rpm
- Comfort with compiling and packaging open source software like Redis, Python, or Ruby, including reading and modifying Makefiles
- Experience with parallel or distributed file systems is a benefit