Site Reliability Engineer
Reliability Engineering at Collective Health is a discipline that combines software and systems engineering skills. We apply it to build and run secure, fault-tolerant distributed applications that meet the challenges of the healthcare space. We extend and apply modern systems, software, architecture, and development practices to give our customers a more reliable overall healthcare management experience. Collective Health Reliability Engineers ensure that our critical services, both internal and externally visible, are always there when our users need them. We do this by delivering on uptime guarantees and managing capacity and performance.
The SRE mindset enables us to deliver better-running production applications quickly and efficiently. We develop a broad understanding of how our systems relate to one another so that we can engineer solutions to hard operational problems. Our focus is on automating away operational effort and complexity, holding blameless postmortems, and shifting from reactive responses to outages toward proactive identification and mitigation of operational risks. These practices allow us to iteratively achieve highly reliable services that earn user loyalty and trust, and they keep our work interesting and dynamic day to day.