Site Reliability Engineer
We are seeking a bright, energetic, and highly motivated Site Reliability Engineer to join our technology team. If you’re a Systems Engineer who loves automation, or a Software Developer who loves infrastructure, this job is for you! The Site Reliability Engineer will be an integral part of our IT Operations team, and this individual should have a passion for technology, open source software, and learning from senior members of the team.
Responsibilities
- Help build tools and systems that assist GoHealth's Engineering, QA, and Operations teams in delivering high-quality software
- Work with development teams and leadership to help evolve our continuous delivery process
- Monitor and respond to alerts from technology infrastructure to ensure SLAs are met
- Work throughout the technology stack to assist senior team members in designing, building, and monitoring solutions that allow for continued scalability
- Document troubleshooting work, and apply problem-solving skills from the early stages of design all the way through identifying production issues
- Remain flexible, and exude a strong sense of ownership of uptime and system performance
- Bring experience in programming and developing software, or a strong interest in learning
Qualifications
- BS in Computer Science (or equivalent experience) and a minimum of 2 years of overall experience, including experience with open source technologies, automated configuration, DevOps, or cloud automation development
- Experienced in performing systems administration and managing a Linux environment (RHEL, CentOS)
- Familiar with one or more scripting languages (bash, Python, Perl, Ruby or similar)
- Exposure to configuration management tools such as Puppet or Chef
- Experience developing software with a wide variety of open source technologies to scale, automate, and monitor systems, or a strong interest in learning
- Experience supporting and troubleshooting the following technology components (or similar) is a plus: Docker/Origin/Kubernetes, Jenkins, Nagios/Elasticsearch/Splunk, Apache/Tomcat/Nginx, MySQL/Couchbase, LDAP, and Kerberos
- Hands-on experience with Nagios and with creating custom scripts to monitor all aspects of application infrastructure is a plus
- Bonus points for individuals with working knowledge of building and maintaining IP-based networks (Cisco) and F5 LTM, DNS and/or iRule development experience
- Experience with VMware/vCenter or OpenStack is preferred, but not required
- Experience managing a 24/7 SaaS infrastructure
- Detail-oriented individual with strong communication skills, used to collaborate and keep others informed
- Ability to produce results while working both independently and in a team environment