Site Ops Incident Manager

Sorry, this job was removed at 2:21 a.m. (CST) on Tuesday, August 22, 2017

What We Do
Uptake is a Chicago-based predictive analytics SaaS platform provider that empowers major industry leaders to optimize performance, reduce asset failures and enhance safety. At Uptake, we combine our strengths—machine learning, analytics, data visualization and software development—with the expertise of our industrial partners. The result is enormous savings in development time and resources for Uptake’s partners and a proven industrial grade software platform that delivers value to partners and their end customers.

What You’ll Do
As an Incident Manager, you’ll perform Incident Management functions critical to Uptake’s applications and infrastructure. The IM will be on our Site Operations team and will be responsible for leading restoration of site-impacting incidents through ownership of outage bridge calls/meetings, triage and investigation of infrastructure and application health, and orchestration of available resources to drive resolution of degraded systems as quickly as possible. A strong understanding of SaaS and infrastructure fundamentals is key for this position, as are communication skills and the ability to work both individually and across a globally diverse group of engineers and support staff.

Responsibilities:

  • Own and drive restoration and coordinate efforts for major incidents across multiple support teams
  • Track, report on, follow up on, and maintain/update incident tickets, making sure they are resolved within set Service Level Agreements
  • Foster IT best practices for Incident Management including: detection, triaging, assessment, troubleshooting and restoration
  • Identify problems affecting site/infrastructure resiliency, availability, and performance
  • Mentor colleagues and take part in onboarding new hires
  • Perform Problem Management, including Root Cause Analysis (we use JIRA for tracking and documenting)
  • Create, maintain, and assign Standard Operating Procedures to facilitate knowledge transfer
  • Handle occasional Change Management (we currently use JIRA for change tickets)

Requirements

  • 6+ years of experience supporting large-scale web applications and infrastructure
  • 2 to 4 years in an operational or analytical role
  • Experience as an Incident Manager, Operations Manager, or Problem Manager
  • ITIL certification – at minimum: ITIL Foundation Certificate in IT Service Management
  • Strong analytical and problem-solving skills
  • Technical background or ability to pick up technology concepts quickly
  • Familiarity with SaaS or e-commerce website architecture
  • Exposure to ITSM/ITIL processes such as change, incident, problem and capacity management
  • Documentation, reporting, and organizational skills: JIRA, JQL queries, Confluence, Excel
  • Demonstrated statistical modeling capability
  • Experience using monitoring tools such as: Grafana, Zabbix, New Relic

Preferred skills:

  • Excellent written and oral communication and interpersonal skills
  • Ability to identify goals and work independently
  • Knowledge of core e-commerce technologies including cloud, web services and multi-tier architectures
  • Ability to define and optimize processes
  • Ability to work collaboratively in a fast-paced, entrepreneurial environment

Location

We are located in River North, just off the Chicago Brown Line stop. We also provide a free shuttle service to and from Ogilvie and Union.
