Top Reliability Engineer Jobs in Chicago, IL

3 Days Ago
Hybrid
Chicago, IL
103K-159K Annually
Mid level
Artificial Intelligence • Cloud • Information Technology • Legal Tech • Productivity • Software
Lead incident response and reliability improvements for iManage Cloud. Triage large-scale production issues, build observability and automation, run postmortems, partner with product and engineering, and proactively detect and eliminate systemic problems to improve uptime and customer experience.
Top Skills: Azure, Azure Kubernetes Service (AKS), Bash, Grafana, Kibana, PowerShell, Prometheus, Python, REST APIs, Shell, Splunk, SQL
Reposted 12 Hours Ago
Hybrid
Chicago, IL
113K-188K Annually
Senior level
Big Data • Fintech • Information Technology • Business Intelligence • Financial Services • Cybersecurity • Big Data Analytics
The Staff Site Reliability Engineer will lead reliability strategies, manage high-risk initiatives, and enhance engineering standards while ensuring system reliability and operational excellence within a hybrid work environment.
Top Skills: Bash, CI/CD, Database Architecture, Go, Google Cloud Platform, Infrastructure-as-Code, Kubernetes, Monitoring Platforms, Pulumi, Python, Terraform
Reposted 12 Hours Ago
Easy Apply
Remote or Hybrid
Chicago, IL
127K-249K Annually
Senior level
Big Data • Cloud • Software • Database
As a Senior Site Reliability Engineer, you'll design and build complex systems, support Atlas platform operations, automate processes, and ensure high availability of services.
Top Skills: AWS, Azure, DNS, GCP, Go, HTTP, Linux, Python, Ruby, TLS
2 Days Ago
Hybrid
Chicago, IL
85K-130K Annually
Mid level
Artificial Intelligence • Big Data • Healthtech • Machine Learning • Analytics • Biotech • Generative AI
Join the SRE team to design, deploy, and operate resilient cloud infrastructure. Recommend solutions, automate workflows, configure Terraform and CI, implement monitoring and alerts, and support developers and users.
Top Skills: Ansible, Aurora MySQL, AWS, Azure, Bash, Chef, CloudFormation, Composer, Concourse, Dataproc, Docker, GCP, Go, HIPAA, HITRUST, ISO, Kubernetes, Packer, Postgres, Puppet, Python, Ruby, Salt, Slack, Terraform
Reposted 4 Days Ago
Easy Apply
Remote or Hybrid
Chicago, IL
127K-249K Annually
Senior level
Big Data • Cloud • Software • Database
Develop and maintain Kubernetes runtime environments, support developers, resolve critical issues, and participate in on-call rotations for production systems.
Top Skills: AWS, Azure, cert-manager, CoreDNS, CRDs, CRI, CSI, Gatekeeper, GCP, Go, Helm, Kubernetes, Kustomize, Operators, Python, Terraform
5 Days Ago
Hybrid
Chicago, IL
118K-176K Annually
Senior level
Digital Media • Information Technology • News + Entertainment
Responsible for ensuring reliability, scalability, and performance of data platforms. Design monitoring and alerting, automate deployments and recovery, optimize storage and query performance, troubleshoot incidents, plan capacity and scaling, document operations, enforce security/compliance, and collaborate with data engineering, product, and data science teams to maintain high availability of large-scale data systems.
Top Skills: Ansible, AWS, Azure, CI/CD, Docker, ELK Stack, GCP, Go, Grafana, Java, Kubernetes, MySQL, NoSQL, Postgres, Prometheus, Python, Scala, Terraform
Reposted 7 Days Ago
Easy Apply
In-Office
Chicago, IL
Mid level
AdTech
As a Site Reliability Engineer, you'll maintain the infrastructure for systems, ensure efficiency, automate processes, monitor databases, and participate in architecture discussions.
Top Skills: Amazon Kinesis, AWS Lambda, AWS SNS, BigQuery, Docker, GCP (Google Cloud Platform), GitLab, Google Cloud Functions, Google Cloud Run, Google Pub/Sub, Grafana, Istio, Kafka, Kubernetes, MySQL, Prometheus, Spanner, SQL, Terraform
Reposted 2 Days Ago
In-Office
Chicago, IL
82K-102K Annually
Entry level
Appliances • Manufacturing
The Reliability Engineer identifies and resolves product failures, analyzes data, collaborates with teams to enhance product reliability, and supports new product development for markets in the Americas.
Top Skills: Excel
Reposted 12 Hours Ago
Remote
Chicago, IL
150K-220K Annually
Senior level
Artificial Intelligence • Machine Learning • Natural Language Processing • Software • Conversational AI
The engineer will build and operate AI/ML infrastructure, managing services on AWS and bare metal, using tools like Kubernetes and Terraform.
Top Skills: AWS, Bash, Go, Kubernetes, Python, Slurm, Terraform
Reposted 11 Days Ago
Hybrid
Chicago, IL
130K-180K Annually
Senior level
Artificial Intelligence • Cloud • Information Technology • Legal Tech • Productivity • Software
The Senior Site Reliability Engineer will focus on automating infrastructure, enhancing cloud resilience, supporting deployments, and mentoring teams in reliability best practices, while participating in on-call rotations.
Top Skills: Azure, Bash, CI/CD, Docker, Go, Grafana, Java, Kubernetes, PowerShell, Prometheus, Python, Ruby, Terraform
Reposted 5 Days Ago
In-Office
Chicago, IL
128K-191K Annually
Senior level
Fintech • Payments • Financial Services
The Staff Reliability Engineer will enhance data platform reliability through automation, incident management, and observability in a hybrid work setting.
Top Skills: AIOps, Ansible, AWS, CI/CD, CloudFormation, Datadog, Dynatrace, EKS, EMR, GCP, Grafana, Hadoop, OpenSearch, Prometheus, Python, Snowflake, Splunk, Terraform
Reposted 2 Days Ago
Easy Apply
Remote or Hybrid
Chicago, IL
200K-230K Annually
Senior level
Artificial Intelligence • Machine Learning
Lead development of AI-assisted reliability tooling, own incident response end-to-end, improve observability and SLO/SLI frameworks, scale single-tenant SaaS operations, mentor engineers, and reduce recurring operational toil through engineering and automation.
Top Skills: Cloud Platforms, Go, Kubernetes, Linux, LLM/AI Tooling, Logs and Tracing, Observability Tooling, Python, SLO/SLI Frameworks
Reposted 2 Days Ago
Remote
Chicago, IL
223K-302K Annually
Expert/Leader
Artificial Intelligence • Cloud • Consumer Web • Productivity • Software • App development • Data Privacy
The role involves defining reliability strategies, leading initiatives across teams, enhancing monitoring and incident response, and mentoring engineers at Dropbox.
Top Skills: AI Technologies, Debugging, Distributed Systems, Incident Response, Observability, Reliability Risk Management, SLAs, SLOs
Reposted 2 Days Ago
Remote or Hybrid
Chicago, IL
190K-235K Annually
Senior level
HR Tech • Information Technology • Professional Services • Sales • Software
Own and operate production-grade Kubernetes infrastructure on AWS, build GitOps CI/CD with GitHub Actions and ArgoCD, develop AI agents and internal DevOps tooling, maintain Datadog-based observability, and manage on-call incident response while collaborating with engineering teams to improve reliability and delivery speed.
Top Skills: AI/LLM, ArgoCD, AWS, CI/CD, Datadog, GitHub Actions, GitOps, Go, Kubernetes, Python
3 Days Ago
Remote or Hybrid
Chicago, IL
200K-250K Annually
Senior level
Digital Media • Gaming • Information Technology • Software • Sports • Esports • Big Data Analytics
Lead long-term strategy and architecture for cloud and on‑prem platform infrastructure, driving Kubernetes and multi‑cloud reliability, IaC/GitOps automation, observability, SLO/SLI/error‑budget practices, incident leadership, AI‑augmented tooling adoption, and mentorship of senior engineers to improve platform resilience and developer experience.
Top Skills: Amazon Elastic Kubernetes Service (EKS), Autoscaling, AWS, Capacity Planning, CI/CD, GitOps, Go, Google Cloud Platform, Google Kubernetes Engine (GKE), Identity and Access Management, Infrastructure as Code, Kubernetes, Linux, Networking, Observability, Operators, Pulumi, Python, RKE2, Storage, Terraform
Reposted 7 Days Ago
In-Office
Chicago, IL
160K-220K Annually
Senior level
Cloud
The role involves designing, optimizing, and maintaining PostgreSQL and MySQL databases, ensuring high availability, reliability, and performance for mission-critical systems, while automating operational tasks and responding to incidents.
Top Skills: Ansible, AWS, Datadog, GCP, Go, Grafana, Kubernetes, MySQL, Postgres, Prometheus, Python, Terraform
8 Days Ago
In-Office
Chicago, IL
80K-148K Annually
Senior level
Aerospace • Logistics • Security • Software • Cybersecurity
Support IRCM product line by establishing, monitoring, and verifying subsystem reliability and maintainability requirements. Perform reliability predictions, FMECA, MTTR analysis, and FRACAS root cause/corrective action investigations. Ensure R&M program requirements are achieved and drive related projects and processes. On-site in Rolling Meadows, IL.
Top Skills: FMECA, FRACAS, IRCM, MTTR, Reliability and Maintainability (R&M), Systems Engineering
7 Days Ago
Easy Apply
Remote
Chicago, IL
218K-257K Annually
Senior level
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
Design, build, and launch reliability engineering projects to scale production services. Improve scalability, observability, secrets/configuration management, canary-based deployments, and deployment safety. Partner with teams, support critical services, participate in on-call rotations, and promote reliability best practices.
Top Skills: AWS, Azure, Datadog, GCP, Generative AI, Go, Kibana, Ruby, Terraform
2 Days Ago
In-Office or Remote
Chicago, IL
Mid level
Renewable Energy
Own reliability, performance, and scalability of Postgres and ClickHouse databases. Build scalable data pipelines, design analytical schemas and DBT models, migrate data to ClickHouse, implement data quality checks, eliminate duplicates, and manage database infrastructure via IaC.
Top Skills: AWS CDK, AWS Step Functions, CI/CD, ClickHouse, Dagster, dbt, Postgres, Pulumi, Python, SQL
Reposted 2 Days Ago
Remote
Chicago, IL
Senior level
Software
As a Senior DevOps / Platform Reliability Engineer, you will manage CI/CD pipelines, automate infrastructure, operate Kubernetes, and enhance observability while ensuring security and compliance for enterprise systems.
Top Skills: Argo CD, Aurora MySQL, AWS, Bash, CloudFormation, EKS, ElastiCache, GitHub Actions, Grafana, Kubernetes, Linux, MSK, OpenTelemetry, Prometheus, Python, S3, Terraform
3 Days Ago
Remote
Chicago, IL
165K-185K Annually
Senior level
Artificial Intelligence • Insurance • Software • Automation
Lead design, automation, and optimization of database infrastructure (PostgreSQL/Aurora). Build monitoring, tuning, and scaling strategies, create automation tooling, drive performance and reliability initiatives, and expand into broader SRE responsibilities to improve availability and system health for a growing SaaS platform.
Top Skills: Amazon Aurora, CI/CD, Docker, JavaScript, Kubernetes, Node.js, Postgres, Prisma, Redshift, Terraform, Terragrunt, TypeScript
3 Days Ago
In-Office or Remote
Chicago, IL
119K-178K Annually
Senior level
Automotive • Information Technology • Other • Transportation • Energy
Perform RAM and FMECA/FMEA analyses, develop fault trees and reliability predictions, support maintainability and logistics analyses, produce reliability growth test plans, contribute to systems engineering documentation, advise design engineers on R&M shortfalls, and present results to management and clients.
Top Skills: Fault Tree Analysis, FMEA, FMECA, Integrated Logistics Support (ILS/ILSA), ISO-9000, MIL-HDBK-217F, RAM Modelling, RAM Software, Statistical Methods
Reposted 9 Days Ago
Easy Apply
Remote or Hybrid
Chicago, IL
126K-248K Annually
Senior level
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will develop and support distributed storage services, ensuring reliability and operational safety, with a focus on automation and efficiency.
Top Skills: AWS, Azure, DNS, Go, Google Cloud Platform, Kubernetes, Linux, Python, TCP/IP, TLS
Reposted 9 Days Ago
Easy Apply
Remote or Hybrid
Chicago, IL
127K-249K Annually
Expert/Leader
Big Data • Cloud • Software • Database
Seeking a Site Reliability Engineer with expertise in networking and distributed systems for building secure multi-cloud infrastructure. Responsibilities include maintaining network architecture and ensuring reliable service-to-service communication, involving a 24/7 on-call rotation.
Top Skills: AWS, Azure, BGP, DNS, GCP, IPv6, Kubernetes, Load Balancing, mTLS, Service Mesh, TCP/IP, TLS, VPCs, VPNs
10 Days Ago
Easy Apply
Remote
Chicago, IL
146K-225K Annually
Junior
Big Data • Fintech • Mobile • Payments • Financial Services
Design and build a centralized reliability platform and developer-facing APIs. Implement AI agents for incident triage, log/trace summarization, and recommended actions. Own projects end-to-end and collaborate with product, infra, data, and SRE teams.
Top Skills: AI Frameworks, APIs, Claude, Copilot, Cursor, LLMs, Python