Positions in this function coordinate the processes and activities that focus on restoring service after an incident occurs. They monitor environment health and facilitate high-severity incidents to improve service availability and continuity. This function also includes those who operate and monitor computer and peripheral equipment. They coordinate the efforts of all functions to complete scheduled jobs in a timely manner, document all problems (hardware, application, facility, etc.), take corrective action as required, and interface with other departments as required.
- Strong desire to jump in and investigate issues, resolve problems, and communicate status
- Experience in incident management / root cause analysis
- Experience in system performance monitoring / management
- Proactively address problems and escalate appropriately
- Monitor production environments and react to failures/outages per SOPs
- During a major incident, initiate and chair conference calls and drive incident to resolution as quickly as possible
- Assess the criticality of failures/outages as they relate to application and business impact
- Escalate problems and roadblocks as they occur
- Provide status updates to IT Leadership and customers on current IT issues and actions being taken
- Focus on continuous improvement of the incident and problem management process, including inputs from and outputs to other IT Processes
- Identify and document production issues, raise and respond to tickets, resolve them or escalate issues to IT teams if needed
- Work with developers to analyze and identify root cause of application issues for remediation
- Work closely with management and other team members to meet project objectives and maintain maximum uptime
- Use tracking software to track and monitor the resolution of issues and/or open tickets
- Prepare accurate documentation for communication and root cause analysis (RCA)
- Communicate with a diverse group of people, including technical and development personnel, senior management, business customers, and vendors, courteously and in a constructive, professional manner
- Act as the team's escalation point for senior leadership and business partners
- Respond promptly as a first-level responder to technical support requests, received primarily through incoming calls, emails, and the ticket system
- Coordinate with technical support teams to resolve or troubleshoot issues
- Troubleshoot and solve simple to highly complex hardware and software technical issues using a logical troubleshooting approach with attention to detail
- Actively create and update knowledge base articles for internal/external use
- Act as a technical support liaison between our IT teams for high-level issues
- Strong analytical and problem-solving skills
- Ability to work as part of a team; the candidate will work closely with IT infrastructure teams and the development organization
- Document daily issues for the handover call
- Document timelines in case of major outages
- Willing to work in a support function for other NOC Analysts who are managing active bridge calls (calling on-call support, opening incident and/or problem tickets)
- Associate's degree or higher, or equivalent work experience
- 2+ years of progressive IT Network Operations Center (NOC), data center, or command center experience
- Willing and able to work any shift, including weekends as needed, and on-call rotations
- Dynatrace, Nagios, AppDynamics, HP SiteScope, or other similar application monitoring tools
- SolarWinds or other similar network monitoring tools
- Basic network troubleshooting, e.g. ping, traceroute, TCP/IP
- 1+ year of progressive IT service desk or IT support experience
All qualified applicants will receive consideration for employment without regard to age, race, color, religion, sex, sexual orientation, gender identity, national origin, status as a qualified individual with a disability, or Vietnam era or other protected veteran.