Top Reliability Engineer Jobs in Chicago, IL

iManage

Customer Reliability Engineer

3 Days Ago

Hybrid

Chicago, IL

103K-159K Annually

Mid level

Artificial Intelligence • Cloud • Information Technology • Legal Tech • Productivity • Software

Lead incident response and reliability improvements for iManage Cloud. Triage large-scale production issues, build observability and automation, run postmortems, partner with product and engineering, and proactively detect and eliminate systemic problems to improve uptime and customer experience.

Top Skills: Azure, Azure Kubernetes Service (AKS), Bash, Grafana, Kibana, PowerShell, Prometheus, Python, REST APIs, Shell, Splunk, SQL

TransUnion

Staff Site Reliability Engineer

Reposted 12 Hours Ago

Hybrid

Chicago, IL

113K-188K Annually

Senior level

Big Data • Fintech • Information Technology • Business Intelligence • Financial Services • Cybersecurity • Big Data Analytics

The Staff Site Reliability Engineer will lead reliability strategies, manage high-risk initiatives, and enhance engineering standards while ensuring system reliability and operational excellence within a hybrid work environment.

Top Skills: Bash, CI/CD, Database Architecture, Go, Google Cloud Platform, Infrastructure-as-Code, Kubernetes, Monitoring Platforms, Pulumi, Python, Terraform

MongoDB

Site Reliability Engineer (Senior or Staff), Atlas

Reposted 12 Hours Ago

Easy Apply

Remote or Hybrid

Chicago, IL

127K-249K Annually

Senior level

Big Data • Cloud • Software • Database

As a Senior Site Reliability Engineer, you'll design and build complex systems, support Atlas platform operations, automate processes, and ensure high availability of services.

Top Skills: AWS, Azure, DNS, GCP, Go, HTTP, Linux, Python, Ruby, TLS

Tempus AI

Site Reliability Engineer

2 Days Ago

Hybrid

Chicago, IL

85K-130K Annually

Mid level

Artificial Intelligence • Big Data • Healthtech • Machine Learning • Analytics • Biotech • Generative AI

Join the SRE team to design, deploy, and operate resilient cloud infrastructure. Recommend solutions, automate workflows, configure Terraform and CI, implement monitoring and alerts, and support developers and users.

Top Skills: Ansible, Aurora MySQL, AWS, Azure, Bash, Chef, CloudFormation, Composer, Concourse, Dataproc, Docker, GCP, Go, HIPAA, HITRUST, ISO, Kubernetes, Packer, Postgres, Puppet, Python, Ruby, Salt, Slack, Terraform

MongoDB

Senior Site Reliability Engineer, Fleet Management

Reposted 4 Days Ago

Easy Apply

Remote or Hybrid

Chicago, IL

127K-249K Annually

Senior level

Big Data • Cloud • Software • Database

Develop and maintain Kubernetes runtime environments, support developers, resolve critical issues, and participate in on-call rotations for production systems.

Top Skills: AWS, Azure, cert-manager, CoreDNS, CRDs, CRI, CSI, Gatekeeper, GCP, Go, Helm, Kubernetes, Kustomize, Operators, Python, Terraform

Comcast

Sr. Site Reliability Engineer, Data - FreeWheel

5 Days Ago

Hybrid

Chicago, IL

118K-176K Annually

Senior level

Digital Media • Information Technology • News + Entertainment

Responsible for ensuring reliability, scalability, and performance of data platforms. Design monitoring and alerting, automate deployments and recovery, optimize storage and query performance, troubleshoot incidents, plan capacity and scaling, document operations, enforce security/compliance, and collaborate with data engineering, product, and data science teams to maintain high availability of large-scale data systems.

Top Skills: Ansible, AWS, Azure, CI/CD, Docker, ELK Stack, GCP, Go, Grafana, Java, Kubernetes, MySQL, NoSQL, Postgres, Prometheus, Python, Scala, Terraform

Attain

Sr/Staff Site Reliability Engineer, Consumer Apps

Reposted 7 Days Ago

Easy Apply

In-Office

Chicago, IL

Mid level

AdTech

As a Site Reliability Engineer, you'll maintain the infrastructure for systems, ensure efficiency, automate processes, monitor databases, and participate in architecture discussions.

Top Skills: Amazon Kinesis, AWS Lambda, AWS SNS, BigQuery, Docker, GCP (Google Cloud Platform), GitLab, Google Cloud Functions, Google Cloud Run, Google Pub/Sub, Grafana, Istio, Kafka, Kubernetes, MySQL, Prometheus, Spanner, SQL, Terraform

Dyson

Reliability Engineer

Reposted 2 Days Ago

In-Office

Chicago, IL

82K-102K Annually

Entry level

Appliances • Manufacturing

The Reliability Engineer identifies and resolves product failures, analyzes data, collaborates with teams to enhance product reliability, and supports new product development for markets in the Americas.

Top Skills: Excel

Deepgram

Site Reliability Engineer - AI & ML Infrastructure (Kubernetes, AWS & Terraform)

Reposted 12 Hours Ago

Remote

Chicago, IL

150K-220K Annually

Senior level

Artificial Intelligence • Machine Learning • Natural Language Processing • Software • Conversational AI

The engineer will build and operate AI/ML infrastructure, managing services on AWS and bare metal, using tools like Kubernetes and Terraform.

Top Skills: AWS, Bash, Go, Kubernetes, Python, Slurm, Terraform

iManage

Senior Site Reliability Engineer

Reposted 11 Days Ago

Hybrid

Chicago, IL

130K-180K Annually

Senior level

Artificial Intelligence • Cloud • Information Technology • Legal Tech • Productivity • Software

The Senior Site Reliability Engineer will focus on automating infrastructure, enhancing cloud resilience, supporting deployments, and mentoring teams in reliability best practices, while participating in on-call rotations.

Top Skills: Azure, Bash, CI/CD, Docker, Go, Grafana, Java, Kubernetes, PowerShell, Prometheus, Python, Ruby, Terraform

The Hartford Financial Services Group, Inc.

Staff Reliability Engineer

Reposted 5 Days Ago

In-Office

Chicago, IL

128K-191K Annually

Senior level

Fintech • Payments • Financial Services

The Staff Reliability Engineer will enhance data platform reliability through automation, incident management, and observability in a hybrid work setting.

Top Skills: AIOps, Ansible, AWS, CI/CD, CloudFormation, Datadog, Dynatrace, EKS, EMR, GCP, Grafana, Hadoop, OpenSearch, Prometheus, Python, Snowflake, Splunk, Terraform

Domino Data Lab

Staff Site Reliability Engineer

Reposted 2 Days Ago

Easy Apply

Remote or Hybrid

Chicago, IL

200K-230K Annually

Senior level

Artificial Intelligence • Machine Learning

Lead development of AI-assisted reliability tooling, own incident response end-to-end, improve observability and SLO/SLI frameworks, scale single-tenant SaaS operations, mentor engineers, and reduce recurring operational toil through engineering and automation.

Top Skills: Cloud Platforms, Go, Kubernetes, Linux, LLM/AI Tooling, Logs and Tracing, Observability Tooling, Python, SLO/SLI Frameworks

Dropbox

Staff Site Reliability Engineer, Production Engineering

Reposted 2 Days Ago

Remote

Chicago, IL

223K-302K Annually

Expert/Leader

Artificial Intelligence • Cloud • Consumer Web • Productivity • Software • App development • Data Privacy

The role involves defining reliability strategies, leading initiatives across teams, enhancing monitoring and incident response, and mentoring engineers at Dropbox.

Top Skills: AI Technologies, Debugging, Distributed Systems, Incident Response, Observability, Reliability Risk Management, SLAs, SLOs

HiBob

Senior Site Reliability Engineer - Remote EST

Reposted 2 Days Ago

Remote or Hybrid

Chicago, IL

190K-235K Annually

Senior level

HR Tech • Information Technology • Professional Services • Sales • Software

Own and operate production-grade Kubernetes infrastructure on AWS, build GitOps CI/CD with GitHub Actions and ArgoCD, develop AI agents and internal DevOps tooling, maintain Datadog-based observability, and manage on-call incident response while collaborating with engineering teams to improve reliability and delivery speed.

Top Skills: AI/LLM, ArgoCD, AWS, CI/CD, Datadog, GitHub Actions, GitOps, Go, Kubernetes, Python

DraftKings

Principal Site Reliability Engineer

3 Days Ago

Remote or Hybrid

Chicago, IL

200K-250K Annually

Senior level

Digital Media • Gaming • Information Technology • Software • Sports • Esports • Big Data Analytics

Lead long-term strategy and architecture for cloud and on‑prem platform infrastructure, driving Kubernetes and multi‑cloud reliability, IaC/GitOps automation, observability, SLO/SLI/error‑budget practices, incident leadership, AI‑augmented tooling adoption, and mentorship of senior engineers to improve platform resilience and developer experience.

Top Skills: Amazon Elastic Kubernetes Service (EKS), Autoscaling, AWS, Capacity Planning, CI/CD, GitOps, Go, Google Cloud Platform, Google Kubernetes Engine (GKE), Identity and Access Management, Infrastructure as Code, Kubernetes, Linux, Networking, Observability, Operators, Pulumi, Python, RKE2, Storage, Terraform

Okta

Senior Database Reliability Engineer (DBRE)

Reposted 7 Days Ago

In-Office

Chicago, IL

160K-220K Annually

Senior level

Cloud

The role involves designing, optimizing, and maintaining PostgreSQL and MySQL databases, ensuring high availability, reliability, and performance for mission-critical systems, while automating operational tasks and responding to incidents.

Top Skills: Ansible, AWS, Datadog, GCP, Go, Grafana, Kubernetes, MySQL, Postgres, Prometheus, Python, Terraform

Northrop Grumman

Reliability Engineer / Principal Reliability Engineer

8 Days Ago

In-Office

Chicago, IL

80K-148K Annually

Senior level

Aerospace • Logistics • Security • Software • Cybersecurity

Support IRCM product line by establishing, monitoring, and verifying subsystem reliability and maintainability requirements. Perform reliability predictions, FMECA, MTTR analysis, and FRACAS root cause/corrective action investigations. Ensure R&M program requirements are achieved and drive related projects and processes. On-site in Rolling Meadows, IL.

Top Skills: FMECA, FRACAS, IRCM, MTTR, Reliability and Maintainability (R&M), Systems Engineering

Coinbase

Staff Software Engineer, Core Reliability

7 Days Ago

Easy Apply

Remote

Chicago, IL

218K-257K Annually

Senior level

Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3

Design, build, and launch reliability engineering projects to scale production services. Improve scalability, observability, secrets/configuration management, canary-based deployments, and deployment safety. Partner with teams, support critical services, participate in on-call rotations, and promote reliability best practices.

Top Skills: AWS, Azure, Datadog, GCP, Generative AI, Go, Kibana, Ruby, Terraform

Fuse Energy

Database Reliability Engineer

2 Days Ago

In-Office or Remote

Chicago, IL

Mid level

Renewable Energy

Own reliability, performance, and scalability of Postgres and ClickHouse databases. Build scalable data pipelines, design analytical schemas and DBT models, migrate data to ClickHouse, implement data quality checks, eliminate duplicates, and manage database infrastructure via IaC.

Top Skills: AWS CDK, AWS Step Functions, CI/CD, ClickHouse, Dagster, dbt, Postgres, Pulumi, Python, SQL

Zingtree

Senior DevOps / Platform Reliability Engineer

Reposted 2 Days Ago

Remote

Chicago, IL

Senior level

Software

As a Senior DevOps / Platform Reliability Engineer, you will manage CI/CD pipelines, automate infrastructure, operate Kubernetes, and enhance observability while ensuring security and compliance for enterprise systems.

Top Skills: Argo CD, Aurora MySQL, AWS, Bash, CloudFormation, EKS, ElastiCache, GitHub Actions, Grafana, Kubernetes, Linux, MSK, OpenTelemetry, Prometheus, Python, S3, Terraform

Assured

Staff Database Reliability Engineer, DBRE

3 Days Ago

Remote

Chicago, IL

165K-185K Annually

Senior level

Artificial Intelligence • Insurance • Software • Automation

Lead design, automation, and optimization of database infrastructure (PostgreSQL/Aurora). Build monitoring, tuning, and scaling strategies, create automation tooling, drive performance and reliability initiatives, and expand into broader SRE responsibilities to improve availability and system health for a growing SaaS platform.

Top Skills: Amazon Aurora, CI/CD, Docker, JavaScript, Kubernetes, Node.js, Postgres, Prisma, Redshift, Terraform, Terragrunt, TypeScript

Oceaneering

Reliability Engineer

3 Days Ago

In-Office or Remote

Chicago, IL

119K-178K Annually

Senior level

Automotive • Information Technology • Other • Transportation • Energy

Perform RAM and FMECA/FMEA analyses, develop fault trees and reliability predictions, support maintainability and logistics analyses, produce reliability growth test plans, contribute to systems engineering documentation, advise design engineers on R&M shortfalls, and present results to management and clients.

Top Skills: Fault Tree Analysis, FMEA, FMECA, Integrated Logistics Support (ILS/ILSA), ISO-9000, MIL-HDBK-217F, RAM Modelling, RAM Software, Statistical Methods

MongoDB

Site Reliability Engineer (Senior or Staff), Storage Layer Services (SLS)

Reposted 9 Days Ago

Easy Apply

Remote or Hybrid

Chicago, IL

126K-248K Annually

Senior level

Big Data • Cloud • Software • Database

The Senior Site Reliability Engineer will develop and support distributed storage services, ensuring reliability and operational safety, with a focus on automation and efficiency.

Top Skills: AWS, Azure, DNS, Go, Google Cloud Platform, Kubernetes, Linux, Python, TCP/IP, TLS

MongoDB

Staff Site Reliability Engineer, Fabric

Reposted 9 Days Ago

Easy Apply

Remote or Hybrid

Chicago, IL

127K-249K Annually

Expert/Leader

Big Data • Cloud • Software • Database

Seeking a Site Reliability Engineer with expertise in networking and distributed systems for building secure multi-cloud infrastructure. Responsibilities include maintaining network architecture and ensuring reliable service-to-service communication, involving a 24/7 on-call rotation.

Top Skills: AWS, Azure, BGP, DNS, GCP, IPv6, Kubernetes, Load Balancing, mTLS, Service Mesh, TCP/IP, TLS, VPCs, VPNs

Affirm

Software Engineer II, Backend (Reliability Platform)

10 Days Ago

Easy Apply

Remote

Chicago, IL

146K-225K Annually

Junior

Big Data • Fintech • Mobile • Payments • Financial Services

Design and build a centralized reliability platform and developer-facing APIs. Implement AI agents for incident triage, log/trace summarization, and recommended actions. Own projects end-to-end and collaborate with product, infra, data, and SRE teams.

Top Skills: AI Frameworks, APIs, Claude, Copilot, Cursor, LLMs, Python