Finance of America Companies

Principal Production Engineer

Posted 3 Days Ago
Remote
Hiring Remotely in United States
150K-250K Annually
Expert/Leader
Lead enterprise production reliability, incident and problem management, and disaster recovery governance for mission-critical financial systems. Drive incident response, root cause remediation, observability, SLOs/SLIs, DR testing, compliance with SOX ITGC, and cross-team resilience initiatives.
The summary above was generated by AI

Purpose of Role

Responsible for enterprise production reliability, operational resilience, and disaster recovery governance within a regulated Financial Services environment. Provides strategic and hands-on technical leadership across Incident Management, Problem Management, DevOps and IT Service Continuity Management (ITSCM), ensuring mission-critical systems remain stable, recoverable, and compliant with SOX IT General Controls (ITGC) and regulatory expectations. Also defines reliability standards, leads high-priority incident response, eliminates systemic risk through structured root cause remediation, and governs the strategy and implementation of disaster recovery capabilities aligned to business impact and financial reporting integrity. Partners closely with Engineering, Infrastructure, Development, Business, Risk, Compliance, and Internal Audit to protect customer trust, operational continuity, and the organization’s risk posture.

Key Responsibilities and Expectations

  • Serves as senior escalation authority for high-priority production incidents.
  • Leads coordinated response efforts to restore services within defined Service Level Objectives (SLOs).
  • Ensures documented impact assessments for financially significant systems.
  • Drives blameless post-incident reviews and tracks remediation through formal governance processes.
  • Matures enterprise incident response frameworks and escalation models.
  • Partners with Change Enablement, Risk, and Engineering teams to reduce production risk and improve service stability.
  • Owns the end-to-end Problem Management lifecycle, including root cause analysis, known error documentation, and permanent corrective actions.
  • Identifies systemic control weaknesses and drives remediation to prevent repeat incidents.
  • Establishes structured reporting on recurring incidents, MTTR trends, and control effectiveness.
  • Defines and governs enterprise Disaster Recovery (DR) strategy and plans.
  • Ensures alignment of Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) with Business Impact Analysis (BIA).
  • Leads annual and periodic DR testing exercises, validation, and evidence documentation, and reports DR readiness and resilience metrics to senior leadership.
  • Coordinates with Business Continuity, Third-Party Risk, and Infrastructure teams to strengthen operational resilience.
  • Establishes enterprise production reliability standards and resilience frameworks.
  • Architects and enhances observability across applications and infrastructure (metrics, logs, traces).
  • Defines and monitors SLIs/SLOs, availability targets, and error budgets.
  • Drives automation to reduce operational toil and improve system scalability.
  • Partners in implementation and optimization of monitoring platforms (e.g., Datadog, New Relic, Elasticsearch, AWS native tools).
  • Integrates monitoring and alerting workflows with ITSM platforms (e.g., Jira Service Management) for automated ticketing and escalation.
  • Ensures Incident, Problem, Change, and DR processes support SOX ITGC design and operating effectiveness.
  • Maintains audit-ready documentation and evidence for regulatory and internal audit reviews.
  • Participates in control walkthroughs and audit engagements.
  • Identifies and remediates production control gaps impacting financial systems.
  • Establishes resilience metrics aligned to enterprise risk appetite and regulatory expectations.
  • Responds promptly and effectively to urgent business matters as they arise.
  • Performs other duties as assigned.

Reports To

  • VP, Technology Reliability and Release Engineering

Qualifications - Experience/Skills/Competencies

  • Minimum 10 years of relevant experience in Production Engineering, Disaster Recovery, DevOps, or Infrastructure Engineering.
  • Hands-on experience with the following tools and technologies, or comparable platforms: Observability: Datadog, New Relic, Elasticsearch, AWS CloudWatch; Incident Management: Jira Service Management, ITSM practices; CI/CD: TeamCity, Octopus Deploy, Bitbucket, GitHub, Azure DevOps; Infrastructure: AWS (EC2, S3, Lambda, ECS, IAM, CloudFormation or Terraform); Backup and Disaster Recovery (DR): Rubrik.
  • Strong programming/scripting ability in one or more of the following: Python, Bash, PowerShell, Go.
  • Experience building dashboards, KPIs, and reports for engineering and executive audiences.
  • Extensive knowledge of SRE frameworks, including SLOs, SLIs, MTTR, error budgets, and fault tolerance.
  • Extensive knowledge of Data Engineering principles, data lifecycle management, and data quality governance frameworks, ensuring reliability, accuracy, and integrity of enterprise data assets.
  • Strong interpersonal, verbal and written communication, and organizational skills.
  • Ability to manage multiple priorities simultaneously and deal with ambiguity.
  • Familiarity with one or more compliance frameworks (ISO 22301, FFIEC, SOC 2, ISO 27001, ITIL, etc.) preferred.
  • Experience in WebOps environments, where website operations and infrastructure are managed alongside content management platforms (e.g., Pantheon or WordPress) with a strong focus on site speed, reliability, and user experience, preferred.
  • Experience with monitoring tools such as Datadog and Elasticsearch preferred.

Qualifications - Education - Required

  • Bachelor's Degree

Qualifications - Education - Field(s)/Profession(s)

  • Computer Science or related technical field.
  • Vendor or industry-standard certifications in applicable specialty or related technology areas.

Qualifications - Education - Preferred

  • Master's Degree

Compensation

The base salary range for this position is $150,000 - $250,000, inclusive of all geographical differences in the labor market. The base salary for the position will be determined based on factors such as the candidate’s work location, skills, education, and experience. In addition to those factors, we believe in the importance of pay equity and consider the internal equity of our current team members in determining any final offer. We offer a competitive benefits package including health, dental, vision, life insurance, paid time-off benefits, flexible spending account, 401(k) with employer match, and ESPP.

Additional Information

The application deadline for this job opportunity is 5/1/2026. 

The above statements are intended to describe the general nature and level of work being performed by people assigned to this classification. They are not to be construed as an exhaustive list of all responsibilities, duties, and skills required of personnel so classified. 

Finance of America is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, sex (including pregnancy), sexual orientation, religion, creed, age, national origin, physical or mental disability, gender identity and/or expression, marital status, veteran status or other characteristics protected by law.

Equal Opportunity Employer
This employer is required to notify all applicants of their rights pursuant to federal employment laws. For further information, please review the Know Your Rights notice from the Department of Labor.

Top Skills

Datadog, New Relic, Elasticsearch, AWS CloudWatch, Jira Service Management, TeamCity, Octopus Deploy, Bitbucket, GitHub, Azure DevOps, AWS EC2, AWS S3, AWS Lambda, AWS ECS, AWS IAM, AWS CloudFormation, Terraform, Rubrik, Python, Bash, PowerShell, Go


