Site Reliability Engineering Team Lead
- Tracking Code
- 1542-626
- Job Description
Site Reliability Engineering Team Lead
Reports to the Site Reliability Engineering Manager
1st Shift: 9:00 AM - 6:00 PM CST (Monday-Friday)
Our Site Reliability Engineering Team:
Is uniquely positioned as a gatekeeper of Enova’s business-critical systems and acts as the technology operations backbone of the company – ensuring all systems and business-critical processes are functioning correctly for successful enterprise operations.
Our Site Reliability Engineers provide automated solutions to business teams and work closely with every department to resolve issues, assist with root-cause analysis, and implement fixes for workaround requests. They also integrate new application-related utilities to meet business requirements, while continuously looking for ways to improve overall system performance.
We speak this language:
Customer service, efficiency, initiative, automation, ownership, growth, and dedication
This is where YOU come in:
As the Site Reliability Engineering Team Lead, you will:
- Monitor team critical metrics and ensure SLA observance, ticket quality and process adherence
- Keep a pulse on all daily operational work (bank file monitoring, batch performance, ticket processing, alert response, etc.), ensuring all stakeholder requests receive a response and assigning work to SRE team members as appropriate
- Handle shift logistics for the team (holiday coverage, on-call schedules, time off requests, etc.)
- Provide coaching, training and mentorship to junior members of the team
- Identify trends and areas where the team can improve while implementing best practices
- Work closely with the SRE Manager to monitor team morale and ensure team members are productive, engaged and motivated
- Ensure Major Incidents are handled properly, processes are adhered to and proper controls are in place; run Major Incidents when needed
- Run strategic initiatives and projects within the SRE team
Learn and be able to perform the day-to-day responsibilities of the SRE team, including:
- Identifying and automating manual workarounds and process improvements
- Processing customer service requests and meeting established SLAs
- Understanding the loan life cycle, transactional logic, and framework of our finance models
- Monitoring the availability, latency, scalability and efficiency of all services
- Utilizing Rails console to resolve customer account issues
- Updating knowledge base articles with current information and communicating to the team
- Performing periodic on-call duty as part of the SRE team
You’re right for this job if you:
- have career aspirations of becoming an outstanding people manager for the Site Reliability team
- know how to successfully manage complex projects simultaneously as well as coach and oversee the projects of others
- enjoy leading a team while driving strategic efforts; are resourceful, detail-oriented and accurate
- are looking to be mentored and coached in order to become an effective people manager
- are a good decision-maker, keeping the big picture in mind without losing track of the details
- know when to push back; are able to manage expectations (up/down) and think outside the box
- have a constant drive to improve team/customer experience: "how can I make things better?"
- know how to get around in complex environments; are able to learn new technologies and concepts quickly and build upon that knowledge constantly
- have the ability to balance and prioritize multiple projects/tasks and remain calm under pressure; are self-motivated and self-directed
- have experience in Ruby, Perl, Python, Java, or PHP
- are able to write accurate and efficient SQL queries
- have a solid understanding of IT infrastructure (Linux or systems administration, network technologies, relational databases, web technologies, etc.)
Kudos to you if you:
- have previous experience successfully leading teams
- have a Bachelor’s degree (Computer Science, Engineering, or Information Technology is a plus)
- have previous e-commerce experience
- have worked with monitoring and logging tools such as Zenoss, Nagios, New Relic or Splunk
- have experience with graphing/stats engines like Graphite, Mathematica or Cacti
- are familiar with installing and configuring open source software (Debian, Postgres) and have an understanding of virtualization platforms like OpenStack, AWS, VMware and Xen
- Job Location
- Chicago, Illinois, United States
- Position Type
- Full-Time/Regular