Site Reliability Engineer

Sorry, this job was removed at 2:39 p.m. (CST) on Tuesday, November 19, 2019

The Site Reliability Engineer will be a core part of the Site Reliability & Operations team within our Technology organization. This role will be responsible for defining the future state of our monitoring environment and the key integrations between our monitoring tools, event management, and IT Service Management tool stack. This key position will act as a go-to expert for application performance management and infrastructure monitoring across the product teams in the organization.

This position requires exceptional communication skills, a commitment to exceptional results, and a passion for continuous improvement.

Are you the teammate we are looking for?

Who you are:

  • A senior technologist with a strong background in enterprise-level monitoring solutions and their deployment in a large-scale environment (~5000 hosts)
  • Passionate about continuous improvement in performance & availability in an environment through efficient alerting and routing
  • Proven experience in architecting integrations between different layers of a monitoring tool stack and defining the interaction of data between source monitoring tools, event management, and ITSM
  • Ability to translate business knowledge embedded in our product teams into transactions and key events that will be tracked as part of our reference architecture

How we work:

  • Small, self-sufficient product-oriented teams with an entrepreneurial spirit organized into categories
  • Dedicated Tech Delivery and Enablement experts committed to cutting-edge infrastructure and developer tools
  • Casual, collaborative, agile environment which embraces and operates under our shared principles
  • Complete transparency with open, honest discussions about our progress
  • Close working relationship between executive stakeholders, product teams, and operations-focused teams

What we offer:

  • Lean enabling process that focuses on putting our product teams in the best position to succeed
  • A commitment to investing in our products, hiring the best talent, and giving them the chance to meaningfully contribute to a vast market opportunity
  • A subscription to an Online Training Forum for all technology colleagues

What you bring:

  • Minimum 5-7 years of experience in deploying or maintaining enterprise monitoring tools for Application Performance Management (e.g. AppDynamics, New Relic) and Infrastructure Monitoring (e.g. SolarWinds, SCOM), and in translating the resulting alerts into notifications and escalations in a mixed SaaS and on-premises environment
  • Ability to effectively communicate details of complex issues to stakeholders, business and technical users
  • Analytical skills, with the ability to identify themes within data and make data-driven decisions
  • Provide leadership around defining our key events / alerts in a reference architecture and deploying this architecture across an environment of diverse technology assets
  • Implement a standard way of rolling out monitoring agents to a diverse set of target end-point profiles using deployment automation tools (e.g. Octopus)
  • Maintain agent versioning to ensure stability of the monitoring environment and communicate out any potential gaps in coverage
  • Demonstrated high-level understanding of enterprise software and networking concepts including SaaS technologies, and SDLC
  • Create self-service models for consumption of monitoring toolsets where applicable
  • Knowledge and experience developing and using SOAP, RPC, and REST APIs
  • Working knowledge of scripting languages such as PowerShell for use in workflow orchestration
  • Perform high-level assessments of business, functional, and technical requirements to address monitoring gaps and implement new monitoring products

During the last three months, you would have:

  • Led broad programs in the deployment and configuration of monitoring agents to a variety of application and database hosts
  • Architected the integration of alerting out of the monitoring toolset into event management and ITIL
  • Defined and implemented escalation patterns for different severities of alerts into a paging tool for incident management
  • Identified and documented ongoing monitoring training requirements and created a communication plan
  • Created easily consumable data around application performance and availability for leadership and external stakeholders
  • Presented material around future state developments in proactive monitoring to internal stakeholders and leadership

Location

Our office has modern workspaces, a cafe, and a gym. But since we're a talent-anywhere company, you may find our team members all over Chicagoland.

Learn more about PaylocityFind similar jobs