Top Reliability Engineer Jobs in Chicago, IL

Reposted 11 Days Ago
Remote or Hybrid
Chicago, IL
175K-200K Annually
Senior level
eCommerce • Fintech • Payments • Software
The role involves ensuring software reliability and performance, managing incidents, developing infrastructure automation, and mentoring junior engineers within a platform team.
Top Skills: AWS, CloudFormation, Datadog, Kubernetes, Opentelemetry, Ruby, Ruby On Rails, Terraform
Reposted 22 Days Ago
Hybrid
Chicago, IL
114K-194K Annually
Expert/Leader
Artificial Intelligence • Big Data • Enterprise Web • Fintech • Software • Financial Services
Lead the development of AI-driven software solutions to improve customer experience by addressing issues, monitoring performance, automating resolutions, and mentoring team members in a full stack development environment.
Top Skills: .Net, .Net Core, Angular, AWS, C#, Cloudfront, Docker, DynamoDB, Ecs, Elb, Glue Jobs, Jenkins, Kubernetes, Lambda, Ms Sql, NoSQL, Postgres, Python, Rds, React, Route53, S3, SQL, Terraform, Vue
9 Days Ago
Remote
Chicago, IL
110K-120K Annually
Senior level
Aerospace • Information Technology • Software • Biotech • Design
Provide hands-on component reliability support for ECUs: select components, review BoMs against life requirements, coordinate vendors for data/tests, maintain risk lists, collect and format vendor test data, and drive root-cause coordination to close action items.
Top Skills: Bill Of Materials (Bom), Discrete Components, Electronic Control Units (Ecus), Ics, Passive Components, Transistors
Reposted 11 Days Ago
Remote
Chicago, IL
145K-180K Annually
Senior level
Legal Tech • Software
Lead automation and optimization of Filevine's data platform: performance tune MSSQL/Postgres, optimize Snowflake, provision infrastructure with Terraform/AWS, run stateful containers on Kubernetes, integrate AI/LLM and MCP for operational automation, manage CI/CD, capacity planning, documentation, and serve in 24/7 on-call rotation.
Top Skills: AWS, C#, Dapper, Docker, DynamoDB, Entity Framework, Gitlab, Kubernetes, Llms, Mcp (Model Context Protocol), Microsoft Sql Server (Mssql), Octopus Deploy, Opensearch, Postgres, Powershell, Python, Redis, Snowflake, Terraform
Reposted 20 Days Ago
In-Office
Chicago, IL
Junior
Information Technology • Consulting
The Customer Reliability Engineer will analyze and provide predictive analytics for power generation and mining equipment, ensuring customer satisfaction and monitoring solutions.
Top Skills: Computer Programming, Industrial Equipment, Predictive Analytics Software
Reposted 20 Days Ago
In-Office
Chicago, IL
Junior
Information Technology • Consulting
The Customer Reliability Engineer analyzes the health and performance of various industrial equipment using predictive analytics software, requiring knowledge of engineering processes and equipment.
Top Skills: Predictive Analytics Software
Reposted 22 Days Ago
In-Office
Chicago, IL
Junior
Information Technology • Software
The role involves delivering predictive analytics solutions for customer accounts, analyzing data, and mastering related software tools. Requires teamwork and customer management skills.
Top Skills: Computer Programming, Industrial Equipment, Predictive Analytics Software, Process Equipment, Scripting
Reposted 22 Days Ago
In-Office
Chicago, IL
Junior
Information Technology • Software
The Customer Reliability Engineer leverages expertise in mechanical engineering and IT to provide predictive analytics for industrial equipment maintenance and performance, primarily in power generation, oil and gas, and mining industries.
Top Skills: Computer Programming, Mining, Oil And Gas, Power Generation, Predictive Analytics Software, Scripting
Reposted 14 Hours Ago
In-Office
Chicago, IL
175K-225K Annually
Mid level
Fintech • Payments • Financial Services
The Site Reliability Engineer will automate processes, manage server deployments, and collaborate with teams to enhance operational workflows in a trading environment.
Top Skills: Ansible, C++, Chef, Cloud Infrastructure, Distributed Systems, Docker, Go, Grafana, Hashicorp Nomad, Hpc Clusters, Kubernetes, Linux, Perl, Podman, Prometheus, Puppet, Python, Rancher, Rust, Salt
Reposted 19 Days Ago
Easy Apply
Remote or Hybrid
Chicago, IL
Internship
Cloud • Information Technology • Security • Software • Cybersecurity
This internship role focuses on SRE skills, requiring collaboration and problem-solving in dynamic environments for Zscaler's Zero Trust Exchange team.
Top Skills: Ansible, Aws Ecs, Kubernetes, Linux, Python, Terraform
Reposted 19 Days Ago
Easy Apply
Remote
Chicago, IL
Mid level
Cloud • Security • Software • Cybersecurity • Automation
As a Cloud Cost Utilization SRE at GitLab, you'll manage cloud spending, improve tracking and optimization of cloud usage, and collaborate with finance and engineering teams to enhance cost efficiency across AWS and GCP.
Top Skills: Ansible, AWS, Elk, GCP, Grafana, Loki, Mimir, Prometheus, Tempo, Terraform
Reposted 14 Days Ago
In-Office or Remote
Chicago, IL
Senior level
Software
Drive reliability testing and qualification of cellular base stations, collaborating with R&D for long-term reliability and product lifecycle support.
Top Skills: Excel, MS Office, Ms Word, Ptc Windchill, Python, Telcordia
Reposted 2 Days Ago
In-Office or Remote
Chicago, IL
140K-205K Annually
Senior level
Information Technology • Legal Tech
The Senior Technology Site Reliability Engineer is responsible for maintaining and optimizing infrastructure and applications, ensuring reliability and performance while automating processes and collaborating with teams.
Top Skills: AWS, Chef, Datadog, Go, Grafana, Java, Prometheus, Puppet, Python, Salt, Terraform
Reposted 2 Days Ago
In-Office
Chicago, IL
106K-156K Annually
Senior level
Fintech
Responsible for enhancing application infrastructure, ensuring reliability and scalability, automating processes, implementing observability, and collaborating with software development teams.
Top Skills: AWS, Docker, Git, Go, Java, JavaScript, Kubernetes, Linux, Python, Ruby, Swarm
3 Days Ago
In-Office
Chicago, IL
116K-174K Annually
Senior level
Fintech
Lead SRE work partnering with development teams to design and implement availability, scalability, observability, and automation for production systems. Build tooling, manage incident response and RCAs, optimize capacity and performance, mentor engineers, maintain runbooks, and participate in a 24x7 on-call rotation.
Top Skills: Aurora, AWS, Chef, Ci/Cd, Docker, DynamoDB, Git, Go, Ip, Java, JavaScript, Jenkins, Jms, Kafka, Kubernetes, Linux, Maven, Memcached, Microservices, Observability, Oracle, Python, Redis, Ruby, Sqs, Swarm, Tcp, Udp
3 Days Ago
In-Office
Chicago, IL
140K-170K Annually
Senior level
Financial Services
Design, build, and operate reliable cloud infrastructure and networking (multi-account AWS, VPC, IAM). Implement IaC, CI/CD pipelines, observability (logging/metrics/alerting), automation, and reliability guardrails. Provide production support and incident response, perform root cause analysis, and collaborate with application teams to co-own system design and continuous improvement, using AI-assisted tools where appropriate.
Top Skills: .Net, Ai-Assisted Tools (Claude Code, Github Copilot, Windsurf), AWS, Aws Organizations, Bash, Ci/Cd, CloudFormation, Elastic Stack, Git, Iam, Infrastructure As Code, Java, Jenkins, Node.js, Observability, Opensearch, Powershell, Python, Terraform, Vpc
Reposted 16 Days Ago
Remote
Chicago, IL
146K-162K Annually
Senior level
Healthtech • Software
The Database Reliability Engineer manages and maintains cloud-based database infrastructures for SaaS applications, focusing on automation, process improvement, and collaboration with engineering teams.
Top Skills: Ansible, AWS, Azure, Azure Data Factory, C#, Databricks, GCP, Git, Grafana, Influxdb, MySQL, Postgres, Powershell, Python, SQL, SQL Server, Terraform
Reposted 16 Days Ago
Remote
Chicago, IL
75K-150K Annually
Senior level
Database • Analytics
As a Database Reliability Engineer at ClickHouse, you'll improve reliability, manage escalation processes, support incident response, and enhance database performance while collaborating across teams.
Top Skills: AWS, Azure, C++, Clickhouse, Google Cloud Platform, Python, Shell, SQL
17 Days Ago
Remote or Hybrid
Chicago, IL
Senior level
Software
Lead reliability engineering for Silicon Photonics hardware: define and validate reliability models, perform MTBF/MTBCF predictions, analyze field data, direct verification testing and root-cause analysis, drive corrective actions, and mentor cross-functional teams to improve product reliability.
Top Skills: Derating Analysis, Dfmea, Mtbcf, Mtbf, Sherlock, Silicon Photonics, Telcordia, Thermal Design, Windchill Qs
4 Days Ago
In-Office
Chicago, IL
106K-156K Annually
Senior level
Fintech
Design, build, and maintain scalable, reliable application infrastructure. Automate deployments and configuration, implement observability and monitoring, troubleshoot performance, advise development teams on SDLC and microservice best practices, create runbooks, participate in 24x7 on-call rotation, and ensure security and disaster recovery readiness.
Top Skills: AWS, Ci/Cd, Docker, Git, Go, Ip, Java, JavaScript, Kubernetes, Linux, Monitoring, Observability, Python, Ruby, Scripting Languages, Security Encryption Protocols, Swarm, Tcp, Udp
4 Days Ago
Remote or Hybrid
Chicago, IL
Senior level
Information Technology • Software
Seeking an SRE/Network Engineer with deep MAAS and bare-metal automation expertise to manage hundreds of nodes across distributed sites. Responsibilities include Linux administration, hardware-level diagnostics (BIOS/IPMI/RAID), network design (VLANs/L2-L3/VPN/UniFi), infrastructure automation (Ansible, Bash/Python, Git), observability (Prometheus/Grafana, ELK/Graylog/Loki), PXE/MAAS-based OS provisioning, API integrations, virtualization (OpenStack/Kolla-Ansible, Proxmox, VMware), and container workload support.
Top Skills: Ansible, Bash, Bios, Cloud-Init, Cloudflare Api, Debian, Elk, Git, Gitops, Grafana, Graylog, Ipmi, Ironic, Kolla-Ansible, L2 Routing, L3 Routing, Linux, Loki, Maas, Openstack, Preseed, Prometheus, Proxmox Ve, Pxe, Python, Raid, Ubuntu, Unifi, Vlan, Vmware Esxi, Vpn
Reposted 22 Days Ago
Easy Apply
Remote or Hybrid
Chicago, IL
127K-249K Annually
Senior level
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will lead security design and implementation for cloud infrastructures, mentor teams, and automate security solutions.
Top Skills: Ansible, AWS, Azure, Cloud Security Tools, CloudFormation, GCP, Go, Terraform
23 Days Ago
Easy Apply
Remote or Hybrid
Chicago, IL
227K-272K Annually
Senior level
eCommerce • Healthtech • Kids + Family • Retail • Social Media
Own and evolve Babylist's AWS infrastructure and developer platform using Terraform and Kubernetes. Improve CI/CD reliability, support engineers across environments, define monitoring and alerting standards, lead incident response and postmortems, and shape platform architecture to scale for millions of users.
Top Skills: AWS, Cdn, CircleCI, Cronitor, Datadog, Dns, Eks, Github Actions, Kubernetes, Load Balancers, MySQL, Pagerduty, Rds, Redis, Ruby On Rails, Sentry, Sidekiq, Terraform
23 Days Ago
Easy Apply
Remote
Chicago, IL
130K-140K Annually
Senior level
Artificial Intelligence • Consumer Web • Digital Media • Information Technology • Social Impact • Software
Lead SRE work to keep Circle highly available and performant: respond to incidents, own monitoring/alerting/log management, manage and optimize MySQL/Postgres/ClickHouse/Redis databases, maintain server infrastructure and deployment pipelines, collaborate with engineering teams, and build internal SRE tooling and automation.
Top Skills: AWS, Clickhouse, Kubernetes, Llm-Based Tools (Copilots), MySQL, Postgres, Redis
18 Days Ago
Remote
Chicago, IL
Mid level
Information Technology • Software • Database • Automation
Own on-prem reliability and escalations: reproduce and resolve L2/L3 issues across heterogeneous Kubernetes environments, build diagnostics and automation, improve CI and e2e test stability, establish performance baselines, harden install/upgrade flows, and write tooling in Python/Go/Rust to reduce repeat incidents.
Top Skills: Benchmarking, Ci, Ci/Cd, Containers, E2E Testing, Go, Health Checks, Helm, Installers, Integration Testing, Kubernetes, Load Generation, Logs, Metrics, Networking, Observability, Packaging, Profiling, Python, Rbac, Rust, Storage, Support Bundles, Traces