Systems Reliability Engineer - Monitoring & Escalation at DRW
DRW is a diversified, technology-led principal trading firm. We trade our own capital at our own risk, across a broad range of asset classes, instruments and strategies, in markets around the world. As the markets have evolved over the past 25 years, so has DRW – growing to include real estate, cryptocurrencies, venture capital and several industry acquisitions. With more than 1000 employees globally, we work together to solve interesting problems and capture opportunities. It’s a place of high expectations, deep curiosity, and constant collaboration, with some of the smartest, most passionate people you will meet.
As a Systems Reliability Engineer, you will have a ground-floor opportunity to shape the overall direction of DRW’s Technology Operations Center, a new team acting as the global front line of defense in monitoring and response for our business-critical systems. As part of this team, you will respond rapidly to events and outages, ensuring a consistent and thorough response globally while working to minimize any potential business impact. You will also lead efforts to further enhance the Technology Operations Center’s capabilities through automation and the buildout of new tool sets and processes. You will be uniquely positioned to interact with individuals and teams across all areas of DRW, requiring you to quickly build a working understanding of the relationships between systems and the organizations dependent on them.
You will be right at home if you:
- Have experience working on a globally deployed team providing around-the-clock monitoring and troubleshooting of proprietary and third-party systems
- Are well versed in coordinating efforts with other global support teams to ensure rapid response and escalation, continuity, and consistent support
- Are a seasoned technologist with a broad range of knowledge and experience deploying and supporting proprietary trading technologies
- Have exceptional communication and collaboration skills and the ability to quickly build trust and credibility with a variety of trading, development, infrastructure and operations teams
- Have experience implementing process automation and monitoring tools using a combination of in-house and third-party technologies
- Understand the implications of coordinating planned changes in a global technology ecosystem and have first-hand experience successfully working within a formal Change Management program
- Can build clear, concise documentation for common procedures, troubleshooting scenarios and system architecture
What’s needed in this role:
- Expert troubleshooting and problem analysis skills, spanning multiple technology stacks and high-performance applications deployed across a global footprint
- Experience with monitoring tools and process automation
- Extensive Linux and/or Windows support and engineering experience in a high-availability corporate environment; RHEL, Ubuntu or Debian experience is preferred
- Scripting and automation experience, preferably with Python
- Experience working with configuration management tools
- Excellent verbal and written communication skills
- Bachelor’s degree in Computer Science or equivalent preferred
For more information about DRW's processing activities and our use of job applicants' data, please view our Privacy Notice at https://drw.com/privacy-notice.
California residents, please review the California Privacy Notice for information about certain legal rights at https://drw.com/california-privacy-notice.