Top Remote Reliability Engineer Jobs in Chicago, IL

MongoDB

Site Reliability Engineer (Senior or Staff), Storage Layer Services (SLS)

Reposted 19 Days Ago

Easy Apply

Remote or Hybrid

United States

126K-248K Annually

Senior level

Big Data • Cloud • Software • Database

The Senior Site Reliability Engineer will develop and support distributed storage services, ensuring reliability and operational safety, with a focus on automation and efficiency.

Top Skills: AWS, Azure, DNS, Go, Google Cloud Platform, Kubernetes, Linux, Python, TCP/IP, TLS

MongoDB

Staff Site Reliability Engineer, Fabric

Reposted 19 Days Ago

Easy Apply

Remote or Hybrid

United States

127K-249K Annually

Expert/Leader

Big Data • Cloud • Software • Database

Seeking a Site Reliability Engineer with expertise in networking and distributed systems for building secure multi-cloud infrastructure. Responsibilities include maintaining network architecture and ensuring reliable service-to-service communication, involving a 24/7 on-call rotation.

Top Skills: AWS, Azure, BGP, DNS, GCP, IPv6, Kubernetes, Load Balancing, mTLS, Service Mesh, TCP/IP, TLS, VPCs, VPNs

Humana

Lead Data Engineer – Modernization & Reliability

Reposted 2 Days Ago

In-Office or Remote

Chicago, IL, USA

142K-196K Annually

Mid level

Healthtech

The Lead Data Engineer modernizes and optimizes the Medicaid Market's data platform, manages ETL processes, and partners with the Business Intelligence team to enhance data accessibility and reliability, while also leading contract resources in a complex environment.

Top Skills: Azure Data Factory, Corepoint, Databricks, Rhapsody, SQL Server, SSIS, SSRS

Order.co

Senior Site Reliability Engineer

22 Days Ago

Remote or Hybrid

United States

175K-200K Annually

Senior level

eCommerce • Fintech • Payments • Software

The role involves ensuring software reliability and performance, managing incidents, developing infrastructure automation, and mentoring junior engineers within a platform team.

Top Skills: AWS, CloudFormation, Datadog, Kubernetes, OpenTelemetry, Ruby, Ruby on Rails, Terraform

Moonlite AI

Sr. Site Reliability Engineer (SRE)

Reposted 5 Days Ago

In-Office or Remote

2 Locations

165K-225K Annually

Senior level

Artificial Intelligence • Cloud • Information Technology • Software

Build and operate production-grade AI infrastructure using Kubernetes, ensuring high availability, reliability, and performance. Develop custom operators and implement automation for efficient operations and monitoring.

Top Skills: Ansible, Bash, ELK Stack, Enterprise Storage Systems, Grafana, High-Performance Networking, Kubernetes, Linux, NVIDIA GPU Technologies, Prometheus, Python, Terraform

Upstart

Principal Site Reliability Engineer

Reposted 24 Days Ago

Easy Apply

Remote

United States

195K-270K Annually

Expert/Leader

Artificial Intelligence • Fintech • Machine Learning • Social Impact • Software

As a Principal Software Engineer on the SRE team, lead best practices adoption, mentor engineers, and improve system reliability and user experience through automation and collaboration.

Top Skills: CDK, CloudFormation, Datadog, Go, JavaScript, Prometheus, Python, Terraform, TypeScript

Hudson Information Technology and Manpower Services

Mechanical Engineer – Onshore Reliability

7 Days Ago

In-Office or Remote

5 Locations

Expert/Leader

Agency • Information Technology • Professional Services • Software

Lead development and implementation of preventive and predictive maintenance for onshore mechanical equipment, use CMMS to plan and monitor maintenance, analyze reliability data, perform RCA, support operations and maintenance teams, ensure safety and compliance, and recommend improvements to reduce downtime and costs.

Top Skills: CMMS, Predictive Maintenance, Preventive Maintenance, Root Cause Analysis

Hudson Information Technology and Manpower Services

Mechanical Engineer – Offshore Reliability

7 Days Ago

In-Office or Remote

5 Locations

Expert/Leader

Agency • Information Technology • Professional Services • Software

Lead development and implementation of preventive and predictive maintenance programs for offshore mechanical equipment, use CMMS to plan and track work, perform RCA for failures, support offshore teams in troubleshooting, monitor equipment reliability, and ensure compliance with safety and maintenance standards.

Top Skills: CMMS, Predictive Maintenance, Preventive Maintenance, Root Cause Analysis

Nokia

Reliability Engineer

Reposted 23 Days Ago

In-Office or Remote

United States

Senior level

Software

Drive reliability testing and qualification of cellular base stations, collaborating with R&D for long-term reliability and product lifecycle support.

Top Skills: Excel, MS Office, MS Word, PTC Windchill, Python, Telcordia

DexCare

Senior Site Reliability Engineer

Yesterday

Remote

USA

125K-165K Annually

Senior level

Healthtech

Design, scale, and operate secure AWS cloud infrastructure (EKS, IAM, RBAC); build and maintain IaC (Terraform/Terragrunt), GitHub Actions CI/CD, Datadog observability, and Python automation; document runbooks, participate in on-call rotations, postmortems, and Agile workflows to improve reliability and security.

Top Skills: AWS, Datadog, EC2, EKS, Fargate, GitHub Actions, GitHub Advanced Security, Helm, IAM, JIRA, Kubernetes, Lambda, Python, RBAC, Secrets Manager, Serverless, Terraform, Terragrunt, VPC

Renaissance Learning

Sr Site Reliability Engineer

Yesterday

Remote

110K-151K Annually

Senior level

Edtech

Lead SRE work to improve availability, reliability, observability, and security for a distributed SaaS platform. Build and maintain IaC (Terraform, CloudFormation), support CI/CD, manage containerized production environments (Kubernetes/EKS), run disaster recovery exercises, participate in on-call rotation, collaborate cross-functionally, and mentor teams while integrating tooling including AI into SRE workflows.

Top Skills: .NET, Ansible, AWS EKS, CI/CD, CloudFormation, Docker, Java, JavaScript, Kubernetes, Python, Terraform

Nametag

Senior Software Engineer – Infrastructure & Reliability

Reposted Yesterday

Remote

USA

120K-190K Annually

Senior level

Enterprise Web • Information Technology • Mobile

The Senior Software Engineer will focus on infrastructure, reliability, and platform engineering, designing scalable systems, managing CI/CD processes, and evolving observability and incident response protocols.

Top Skills: AWS, Distributed Tracing, Fly.io, GitHub Actions, Go, Logging, Metrics, Postgres, Terraform

SimSpace

Staff Site Reliability Engineer

Reposted Yesterday

Remote

U.S.

165K-230K Annually

Senior level

Information Technology • Security

The Staff Site Reliability Engineer will lead the architecture and security of the SimSpace cyber range platform, focusing on reliability, automation, and observability across diverse deployment environments while mentoring engineers and driving infrastructure initiatives.

Top Skills: Argo CD, GitHub Actions, Go, Grafana Tanka, Jsonnet, Kubernetes, Python

Andromeda (andromeda.ai)

Staff SRE, AI Infrastructure

Reposted Yesterday

In-Office or Remote

USA

Senior level

Artificial Intelligence • Cloud • Information Technology • Software

As a Staff SRE, you will ensure the reliability and performance of Andromeda's GPU infrastructure, lead incident responses, build observability systems, and mentor engineers, while collaborating closely with engineering and customers.

Top Skills: Ansible, CUDA, Go, Helm, Kubernetes, Linux, NCCL, NVIDIA, Python, Rust, Slurm, Terraform

Arista Networks

FedRAMP Site Reliability Engineer (FedSRE) - CloudVision

Reposted Yesterday

Remote

101K-161K Annually

Senior level

Cloud • Software • Analytics

Join Arista Networks as a Site Reliability Engineer to manage CloudVision service reliability, scalability, and stability in a FedRAMP environment, focusing on areas like architecture, security, and performance optimization.

Top Skills: Ansible, Bash, GCP, GKE, Go, Kubernetes, Pulumi, Python

Metabase

Senior SRE/DevOps Engineer

Reposted Yesterday

Remote

United States

Senior level

Big Data

You will manage AWS infrastructure, automate deployments, debug application issues, and improve the operational health of Metabase Cloud.

Top Skills: AWS, Datadog, Go, Grafana, Kubernetes, Prometheus, Python, Terraform

Vynca Inc

Site Reliability Engineer

2 Days Ago

Remote

United States

140K-150K Annually

Mid level

Healthtech

Design, provision, and operate AWS infrastructure using Terraform; run and scale Kubernetes workloads with Helm; build observability, monitoring, and CI/CD automation; define SLIs/SLOs and lead incident response and postmortems; implement security and compliance (HIPAA/SOC2); participate in on-call rotation and partner with product and engineering on capacity, performance, and resilient system design.

Top Skills: Argo CD, AWS, AWS Secrets Manager, CI/CD, ClickHouse, CloudWatch, Datadog, Event Sourcing, Flux, Go, Grafana, HashiCorp Vault, Helm, Kubernetes, Linux, MySQL, OpenTelemetry, Postgres, Prometheus, Python, Redshift, SigNoz, Snowflake, Terraform

CentralSquare Technologies

Lead Site Reliability Engineer - Remote

Reposted 3 Days Ago

Remote

United States

Senior level

Software

The role involves designing, building, and maintaining AWS infrastructure, implementing IaC, developing CI/CD pipelines, automating operations, and enhancing network and security practices.

Top Skills: AWS, Bash, CI/CD, CloudFormation, Docker, Kubernetes, PowerShell, Python, Terraform

Luma AI

Senior Site Reliability Engineer

Reposted 4 Days Ago

In-Office or Remote

2 Locations

170K-290K Annually

Expert/Leader

Artificial Intelligence • Software

As a Software Engineer in Reliability, you'll architect and manage multi-cloud GPU infrastructure, ensuring performance, security, and scale while debugging complex hardware/software issues.

Top Skills: AMD, AWS, Bash, Go, GPU, InfiniBand, Linux, NVIDIA, OCI, Python, RDMA

Filevine

Senior Site Reliability Engineer - GCP

Reposted 4 Days Ago

Remote

United States

Expert/Leader

Legal Tech • Software

As a Site Reliability Engineer, you'll develop autonomous systems, improve CI/CD pipelines, mentor junior engineers, and ensure software reliability and security in a 24/7 environment.

Top Skills: Bash, PowerShell, Python

CoverMyMeds

Sr. Database Site Reliability Engineer (DB SRE)

Reposted 4 Days Ago

In-Office or Remote

USA

132K-221K Annually

Senior level

Healthtech • Information Technology • Software

The Sr. Database Site Reliability Engineer manages the reliability and performance of Azure PostgreSQL platforms, applying SRE principles for automation and observability. Responsibilities include incident response, backup strategies, and ensuring compliance with security standards.

Top Skills: Argo CD, Azure PostgreSQL, CI/CD, Datadog, Git, Helm, Kubernetes, Terraform

Xpert Development LLC

Senior DevOps & Site Reliability Engineer

5 Days Ago

Remote

United States

165K-190K Annually

Senior level

Artificial Intelligence • Information Technology • Software • Automation

Own US PST coverage for releases and incidents as the first SRE; bridge infrastructure and code by working with Kubernetes, Terraform, and AWS and patching Elixir when needed; lead incident response and post-mortems; define SLOs and observability; author runbooks and support HIPAA-aligned compliance for a regulated medical-device platform.

Top Skills: AWS, Elixir, Kubernetes, Terraform

Cooley

Senior Technology Site Reliability Engineer

Reposted 14 Days Ago

In-Office or Remote

2 Locations

140K-205K Annually

Senior level

Information Technology • Legal Tech

The Senior Technology Site Reliability Engineer is responsible for maintaining and optimizing infrastructure and applications, ensuring reliability and performance while automating processes and collaborating with teams.

Top Skills: AWS, Chef, Datadog, Go, Grafana, Java, Prometheus, Puppet, Python, Salt, Terraform

MixMode

Sr. Software Engineer-AI Reliability

Reposted 5 Days Ago

Remote

USA

150K-210K Annually

Senior level

Big Data • Cybersecurity

The Senior Software Engineer will enhance AI system reliability, performance, and scalability, focusing on distributed services and collaborating with ML researchers.

Top Skills: Java, Kotlin, Kubernetes, Logging, Metrics, Python, Relational Databases, Scala, Tracing

Nebius

Site Reliability Engineer

Reposted 5 Days Ago

Remote

United States

100K-140K Annually

Mid level

Artificial Intelligence • Information Technology • Consulting

The Linux Systems Administrator will maintain and troubleshoot Linux systems, support network services, and work on systems integration while collaborating with infrastructure teams.

Top Skills: DHCP, DNS, Linux, NTP, Python