Top Senior Site Reliability Engineer Jobs in Chicago, IL

Reposted 7 Days Ago
In-Office or Remote
Chicago, IL
160K-179K Annually
Senior level
Fintech • Payments
The Senior Staff SRE leads reliability engineering initiatives, drives operational excellence, mentors staff, and influences architecture to enhance system reliability and performance.
Top Skills: AI/ML, AWS, Azure, Docker, ELK Stack, GCP, Grafana, Kubernetes, MySQL, NoSQL, Postgres, Splunk

8 Days Ago
Hybrid
Chicago, IL
165K-288K Annually
Senior level
Artificial Intelligence • Cloud • Fintech • Information Technology • Analytics • Financial Services • Cybersecurity
Lead adoption and standardization of SRE practices across the enterprise. Establish SRE governance, define reliability metrics (SLOs/SLIs), build a Community of Practice, run training/forums, enable automation and tooling, partner with platform teams on observability, chaos engineering, and self-healing, and drive cross-functional alignment for resilience and incident management.
Top Skills: Automation, Azure Monitor, Chaos Engineering, CI/CD, Cloud-Native, DevOps, Dynatrace, Hybrid Architectures, Incident Management, Observability, Platform Engineering, Prometheus, Release Engineering, Self-Healing, Splunk, SRE

Reposted 9 Days Ago
Hybrid
Chicago, IL
245K-270K Annually
Senior level
Information Technology • Consulting
As a Senior Staff Site Reliability Engineer, you will lead the SRE team, advocate best practices, ensure resilience in cloud architecture, and mentor team members.
Top Skills: Argo CD, CircleCI, Google Cloud Platform, Kubernetes, Pulumi, Terraform, TypeScript

Reposted 14 Hours Ago
Remote
Chicago, IL
115K-135K Annually
Mid level
Aerospace • Manufacturing
As a Site Reliability Engineer, you'll build and manage observability platforms for satellite communications, define SLOs/SLIs, and collaborate on incident response and deployment automation.
Top Skills: Argo CD, AWS, ELK, GCP, Go, Grafana, Istio, Jaeger, Kubernetes, Linkerd, Loki, OpenTelemetry, Prometheus, Python, Tempo, Terraform

Reposted 14 Hours Ago
Remote
Chicago, IL
Senior level
Artificial Intelligence • Fintech • Software • Financial Services
The SRE will own reliability for a cloud-native platform, optimizing performance, availability, and observability, while mentoring engineering teams.
Top Skills: AWS, ClickHouse, Go, Kafka, Kubernetes, Pulumi, Python, Terraform

Reposted Yesterday
Remote
Chicago, IL
Mid level
Blockchain • Software
Build, operate, and scale production Kubernetes infrastructure using GitOps and declarative IaC. Design CI/CD workflows, observability, and secure-by-default systems. Troubleshoot networking/storage, participate in on-call rotations, automate operational workflows, and drive postmortems and reliability improvements.
Top Skills: Arbitrum, Argo CD, Argo CD ApplicationSets, AWS, Azure, Bash, CloudWatch, CodeBuild, GCP, GitHub Actions, GitOps, Go, Grafana, K9s, Kubernetes, Linux, Loki, Mimir, Prometheus, Prysm, Python, Terraform, YAML, ZeroDev

Reposted 11 Days Ago
In-Office or Remote
Chicago, IL
165K-225K Annually
Senior level
Artificial Intelligence • Cloud • Information Technology • Software
Build and operate production-grade AI infrastructure using Kubernetes, ensuring high availability, reliability, and performance. Develop custom operators and implement automation for efficient operations and monitoring.
Top Skills: Ansible, Bash, ELK Stack, Enterprise Storage Systems, Grafana, High-Performance Networking, Kubernetes, Linux, NVIDIA GPU Technologies, Prometheus, Python, Terraform

Reposted Yesterday
Remote
Chicago, IL
Senior level
Automotive
Design and implement scalable cloud infrastructure, monitor performance, automate processes, ensure security and compliance, and lead a DevOps team.
Top Skills: AWS, Bash, CI/CD, Docker, ELK Stack, GCP, Grafana, Kubernetes, Prometheus, Python, Terraform

2 Days Ago
Remote
Chicago, IL
Senior level
Software • Web3
Lead reliability practices across teams: embed early in projects, define SLIs/SLOs, build multi-cloud paved roads with Terraform, run on-call, drive org-wide incident maturity and tooling.
Top Skills: AWS, Azure, GCP, Ruby on Rails, Terraform, TypeScript, WebContainers

2 Days Ago
Remote
Chicago, IL
124K-171K Annually
Senior level
Healthtech • Pharmaceutical • Manufacturing
Support and maintain production Core Speech systems: deploy, monitor, alert, perform capacity planning, respond to on-call incidents, and drive system performance and architecture improvements.
Top Skills: Ansible, AWS CloudFront, AWS DocumentDB, AWS EC2, AWS EFS, AWS EKS, AWS RDS, AWS S3, containerd, Docker, Elasticsearch, Filebeat, Git, GitLab, Go, GoCD, Grafana, Java, Jython, Kibana, Kubernetes, Logstash, MongoDB, Postgres, Python, Redis, Shell, Solr, Terraform

Reposted 2 Days Ago
In-Office or Remote
Chicago, IL
200K Annually
Mid level
Cloud • Software
The Site Reliability Engineer will ensure reliable cloud operations by applying Python for infrastructure automation, managing OpenStack and Kubernetes, and practicing DevSecOps in a fast-paced environment.
Top Skills: Kubernetes, Linux, OpenStack, Python

Reposted 2 Days Ago
In-Office or Remote
Chicago, IL
95K-171K Annually
Junior
Cloud • Security • Software • Cybersecurity
As a Site Reliability Engineer II, you'll automate tasks, monitor AI workloads, enhance dashboards, support CI/CD processes, and collaborate with engineering teams on complex issues while participating in on-call rotations.
Top Skills: Go, Grafana, Kubernetes, Linux, Prometheus, Python, SaltStack, Terraform

Reposted 2 Days Ago
Remote
Chicago, IL
Mid level
Software • Analytics
The role involves automating and managing AWS infrastructure, ensuring reliability and scalability of stateful systems, and optimizing deployment processes. You'll also handle incident responses and improve operational tooling.
Top Skills: AWS, Kubernetes, Terraform, Terragrunt

Reposted 2 Days Ago
Remote
Chicago, IL
220K-250K Annually
Expert/Leader
Cloud • Software • Database
Lead the design, build, and operation of the YugabyteDB DBaaS infrastructure. Drive architecture, automate lifecycle and maintenance, manage incidents and on-call rotations, implement security/encryption processes, and optimize reliability using SRE principles and observability.
Top Skills: AKS, Ansible, AWS, Azure, Bash, Docker, EKS, GCP, Git, GitHub Actions, GKE, Java, Kubernetes, Linux, Postgres, Prometheus, Python, Shell, Terraform

Reposted 2 Days Ago
Remote
Chicago, IL
133K-211K Annually
Mid level
Cloud • Security • Software • Generative AI
Design, build, and automate large-scale multi-cloud infrastructure and internal SRE tools. Improve host lifecycle, observability, alerting, and reliability; operate containerized workloads; participate in on-call rotations, incident response, runbooks, postmortems, code reviews, and mentoring.
Top Skills: Ansible, Argo CD, Argo Workflows, CUE, Docker, Elastic Stack, Go, Graphite, Influx, Kubernetes, Linux, Prometheus, Puppet, Terraform, Ubuntu, Ubuntu Live Patch

Reposted 2 Days Ago
In-Office or Remote
Chicago, IL
165K-215K Annually
Senior level
Software • Cybersecurity
This role involves managing Kubernetes clusters, cloud infrastructure, and CI/CD pipelines. The engineer will enhance system reliability and efficiency while troubleshooting production issues.
Top Skills: Alertmanager, AWS, Azure, Bash, CI/CD, Docker, Elastic Stack, Elasticsearch, GCP, Go, Grafana, Helm, Kafka, Kubernetes, Loki, MongoDB, OCI, Prometheus, Python, Redis, Spark, Terraform

Reposted 2 Days Ago
Remote or Hybrid
Chicago, IL
160K-180K Annually
Senior level
Artificial Intelligence • Machine Learning • Software • Analytics
The role involves end-to-end ownership of AWS infrastructure, managing Kubernetes platforms, and ensuring system reliability through observability and automation. Responsibilities include incident response and maintaining CI/CD systems.
Top Skills: Argo CD, AWS, Datadog, Git, Go, Kubernetes, Python, Terraform

Reposted 2 Days Ago
Remote
Chicago, IL
Mid level
Software • Consulting
The Senior Application Support Engineer leads efforts to ensure application reliability, manages incidents, collaborates with teams, and monitors performance, providing 24/7 support.
Top Skills: AppDynamics, AWS, Datadog, Linux, MuleSoft, OpenTelemetry, Python, ServiceNow, Splunk

Reposted 3 Days Ago
In-Office or Remote
Chicago, IL
Senior level
Artificial Intelligence • Cloud • Information Technology • Software
The Site Reliability Engineer will provision and manage Kubernetes clusters, build automation tools, debug customer issues, and improve infrastructure reliability.
Top Skills: Ansible, Bash, Datadog, Go, Grafana, Helm, Kubernetes, Loki, Prometheus, Python, Terraform

4 Days Ago
Remote
Chicago, IL
180K-224K Annually
Senior level
Artificial Intelligence • Information Technology • Consulting
Build and operate Nebius's network infrastructure: define SLIs/SLOs, improve site and inter-site reliability, lead incident response and postmortems, develop observability and alerting, automate change workflows, and collaborate with network and platform teams to embed operability.
Top Skills: CI/CD, Container Platforms, Go, Infrastructure as Code, Linux, Python

4 Days Ago
In-Office or Remote
Chicago, IL
76K-136K Annually
Mid level
Cloud • Security • Software • Cybersecurity
Design, develop, test, and operate scalable infrastructure and services for Akamai Cloud. Implement and manage Infrastructure-as-Code (Terraform and similar tools), CI/CD, and observability. Automate reliability improvements, mentor engineers, collaborate on incident response and root-cause remediation, and participate in on-call rotations.
Top Skills: Ansible, Chef, CI/CD, Infrastructure as Code, Linux, Observability (Monitoring, Logging, Alerting), Puppet, SaltStack, Terraform

4 Days Ago
Remote
Chicago, IL
130K-160K Annually
Senior level
Other
Design, build, and maintain highly available cloud-native systems. Improve reliability through automation, CI/CD, Kubernetes, observability, and incident management. Collaborate with developers, security, and product teams to define SLOs, implement self-healing, debug production issues, and ensure secure deployments.
Top Skills: AWS, Azure Cloud Services, Datadog, GCP, GitHub Actions, GitLab CI, Go, Infrastructure as Code, Kubernetes, Opsgenie, PagerDuty, Python, Ruby, Site Reliability Engineering Foundation

4 Days Ago
Remote
Chicago, IL
Junior
Software
Support senior SREs to maintain availability, performance, and reliability of VA enterprise platforms. Assist with monitoring, incident response, automation, CI/CD, cloud/container operations (AWS, containers), documentation, and security/compliance under Federal requirements while developing SRE skills.
Top Skills: AWS, Azure, Bash, CI/CD, CloudWatch, Docker, ECS, EKS, ELK, Git, GCP, Grafana, Kubernetes, Linux, PowerShell, Prometheus, Python, Splunk, Terraform

Reposted 11 Days Ago
Remote or Hybrid
Chicago, IL
175K-200K Annually
Senior level
eCommerce • Fintech • Payments • Software
The role involves ensuring software reliability and performance, managing incidents, developing infrastructure automation, and mentoring junior engineers within a platform team.
Top Skills: AWS, CloudFormation, Datadog, Kubernetes, OpenTelemetry, Ruby, Ruby on Rails, Terraform

Reposted 5 Days Ago
In-Office or Remote
Chicago, IL
Senior level
Software
The role involves managing compute infrastructure for decentralized applications, requiring critical thinking, documentation skills, and experience in Kubernetes and blockchain management.
Top Skills: Blockchain, GitOps, Infrastructure-as-Code, Kubernetes, Programming Languages