Get the job you really want.
Maximum of 25 job preferences reached.
Top Remote Reliability Engineer Jobs in Chicago, IL
Artificial Intelligence • Machine Learning
Own and modernize Domino's Tempest scale-testing platform; build repeatable automated validation, sizing guidance, and cloud-scale test automation; partner with platform teams to enable multi-cloud scale testing and improve test reliability and reporting.
Top Skills:
Ci SystemsCloud PlatformsCloud-Native ToolingEnd-To-End FrameworksKubernetesMulti-CloudPerformance/Load Testing FrameworksPythonTempest
Fintech
The Reliability Engineer will ensure high system availability and security while supporting deployments, incident response, and infrastructure strategies. They will collaborate with engineering teams, monitor performance, and develop automation scripts.
Top Skills:
AdoAWSAzureAzure MonitorBash ScriptingChefDatadogDockerGoogle Cloud PlatformJenkinsK8SPowershellPythonTerraformVeeamVmware Site Recovery
Reposted 4 Days AgoSaved
Easy Apply
Easy Apply
Big Data • Cloud • Software • Database
Develop and maintain Kubernetes runtime environments, support developers, resolve critical issues, and participate in on-call rotations for production systems.
Top Skills:
AWSAzureCert-ManagerCorednsCrdsCriCsiGatekeeperGCPGoHelmKubernetesKustomizeOperatorsPythonTerraform
Big Data • Cloud • Software • Database
As a Senior Site Reliability Engineer, you'll design and build complex systems, support Atlas platform operations, automate processes, and ensure high availability of services.
Top Skills:
AWSAzureDnsGCPGoHTTPLinuxPythonRubyTls
Digital Media • Gaming • Information Technology • Software • Sports • Esports • Big Data Analytics
Lead SRE responsible for architecting and automating fault-tolerant, scalable infrastructure across cloud and on-prem, driving deployment, monitoring, and performance tuning while mentoring engineers to improve reliability and SLAs.
Top Skills:
.NetAnsibleAWSAws GreengrassC#ChefDockerElixirGCPGitopsGoJavaKubernetesLinuxNutanixPythonRubyTerraformVsphere
Artificial Intelligence • Fintech • Machine Learning • Social Impact • Software
Lead technical direction for software architecture and cross-team initiatives focusing on scaling consumer-facing systems and maximizing loan originations while maintaining compliance and system integrity.
Top Skills:
AWSCi/CdDockerGithub ActionsInfrastructure As CodeReactRuby On Rails
Fintech • Software
The Senior Site Reliability Engineer ensures fast, stable SaaS products through automation, collaboration, monitoring, and implementing AI tools to enhance performance and reliability.
Top Skills:
Ai ToolsAnsibleAppdynamicsAWSAzureAzure DevopsBashC# .NetCosmosDatadogDynatraceHarnessJavaJenkinsKubernetesNew RelicPowershellPythonSaaSSQLTerraform
Fintech • Financial Services
The Systems Reliability Engineer will support MEMX exchange platforms, handling incidents, improving processes, documenting actions, and debugging issues while collaborating with diverse teams to maintain operational efficiency.
Top Skills:
AnsibleBashChefLinuxPuppetPython
eCommerce • Retail • Software
Responsible for ensuring the availability and reliability of database systems, managing various databases, leading upgrades, and improving processes through automation and observability.
Top Skills:
AWSCi/Cd ToolingDynamoDBElasticsearchMongoDBMySQLPostgresPowershellPythonRedisSQL Server
Reposted 8 Days AgoSaved
Easy Apply
Easy Apply
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
The Site Reliability Engineer will enhance CI/CD frameworks, automate cloud infrastructure, manage Kubernetes and AWS services, and ensure operational excellence.
Top Skills:
AnsibleAWSBashChefCi/CdDockerGitKubernetesPuppetPythonRubySaltTerraform
Legal Tech • Software
Lead automation and optimization of Filevine's data platform: performance tune MSSQL/Postgres, optimize Snowflake, provision infrastructure with Terraform/AWS, run stateful containers on Kubernetes, integrate AI/LLM and MCP for operational automation, manage CI/CD, capacity planning, documentation, and serve in 24/7 on-call rotation.
Top Skills:
AWSC#DapperDockerDynamoDBEntity FrameworkGitlabKubernetesLlmsMcp (Model Context Protocol)Microsoft Sql Server (Mssql)Octopus DeployOpensearchPostgresPowershellPythonRedisSnowflakeTerraform
10 Days AgoSaved
Easy Apply
Easy Apply
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will develop and support distributed storage services, ensuring reliability and operational safety, with a focus on automation and efficiency.
Top Skills:
AWSAzureDnsGoGoogle Cloud PlatformKubernetesLinuxPythonTcp/IpTls
New
Cut your apply time in half.
Use ourAI Assistantto automatically fill your job applications.
Use For Free
Big Data • Cloud • Software • Database
Seeking a Site Reliability Engineer with expertise in networking and distributed systems for building secure multi-cloud infrastructure. Responsibilities include maintaining network architecture and ensuring reliable service-to-service communication, involving a 24/7 on-call rotation.
Top Skills:
AWSAzureBgpDnsGCPIpv6KubernetesLoad BalancingMtlsService MeshTcp/IpTlsVpcsVpns
Artificial Intelligence • Other • Security • Software • Analytics • Big Data Analytics
The Lead Site Reliability Engineer will oversee the Infrastructure SRE team, focusing on system reliability, automation, and mentoring while collaborating with product engineering.
Top Skills:
Ci/CdDatadogDockerElk StackGitopsGoKubernetesLinux/UnixNew RelicNoSQLPrometheusPythonSQLStackdriverTerraform
Reposted 10 Days AgoSaved
Artificial Intelligence • Machine Learning • Natural Language Processing • Software • Conversational AI
The engineer will build and operate AI/ML infrastructure, managing services on AWS and bare metal, using tools like Kubernetes and Terraform.
Top Skills:
AWSBashGoKubernetesPythonSlurmTerraform
Software
Own reliability, performance, and scalability of PostgreSQL infrastructure. Implement HA, replication, observability, capacity planning, automation, and DR. Support engineering teams with migrations, query optimization, on-call incident response, runbooks, and tooling to enable safe DB operations.
Top Skills:
AnsibleAuroraAws RdsChefDatadogDynamoDBElasticacheGoGrafanaIndexingMvccPatroniPgbouncerPostgresPrometheusPythonQuery PlannerReplicationRubySQLTerraformVacuum TuningWal
Artificial Intelligence • Fintech • Machine Learning • Social Impact • Software
As a Principal Software Engineer on the SRE team, lead best practices adoption, mentor engineers, and improve system reliability and user experience through automation and collaboration.
Top Skills:
CdkCloudFormationDatadogGoJavaScriptPrometheusPythonTerraformTypescript
eCommerce • Healthtech • Kids + Family • Retail • Social Media
Seeking a Senior Software Engineer, Site Reliability to ensure system stability, scalability, and reliability, while optimizing AWS infrastructure using modern DevOps practices and tools like Terraform, Docker, and Kubernetes.
Top Skills:
AWSCircleCICronitorDatadogDockerGithub ActionsJenkinsKubernetesMySQLPagerdutyReactRedisRuby On RailsSentrySidekiqTerraform
eCommerce • Retail • Software
The Senior Database Reliability Engineer ensures database availability, reliability, and efficiency, driving initiatives for upgrades, automation, and security while mentoring team members.
Top Skills:
AWSDynamoDBElasticsearchMongoDBMySQLPostgresPowershellPythonRedisSQL Server
Artificial Intelligence • Consumer Web • Digital Media • Information Technology • Social Impact • Software
The Senior Site Reliability Engineer will manage system incidents, improve monitoring and logging, optimize database infrastructure, and collaborate on scaling systems efficiently.
Top Skills:
AWSClickhouseKubernetesMySQLPostgresRedis
Artificial Intelligence • Fintech • Hardware • Information Technology • Sales • Software • Transportation
Design, scale, and manage AWS services for IoT devices. Collaborate on infrastructure, optimize performance, and ensure high availability of services.
Top Skills:
AWSBashGoHelmKubernetesPythonRubyTerraform
Food
The Reliability Engineer will manage maintenance of fixed assets, focusing on equipment reliability, predictive maintenance, and collaboration to reduce downtime and improve performance metrics of packaging operations.
Top Skills:
Automation EquipmentThermoforming Packaging MachinesTpm
Reposted 18 Days AgoSaved
Easy Apply
Easy Apply
Cloud • Information Technology • Security • Software • Cybersecurity
This internship role focuses on SRE skills, requiring collaboration and problem-solving in dynamic environments for Zscaler's Zero Trust Exchange team.
Top Skills:
AnsibleAws EcsKubernetesLinuxPythonTerraform
Reposted 18 Days AgoSaved
Easy Apply
Easy Apply
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
The role involves supporting network infrastructure, automating cloud infrastructure, managing CI/CD workflows, and ensuring operational excellence in IT support, including incident response and security practices.
Top Skills:
AnsibleAWSBashDockerGitKubernetesPythonRubyTerraform
Information Technology • Software
Seeking a Site Reliability Engineer to develop CI/CD solutions, automate infrastructure, manage Kubernetes clusters, and ensure system reliability through collaboration and technical decision-making.
Top Skills:
AnsibleAWSAzureCloudFormationDockerGitGrafanaKafkaKubernetesLinuxNatsPrometheusTerraformVMwareWindows
Top Chicago, IL Companies Hiring Remote Reliability Engineers
See AllPopular Job Searches
All Filters
Total selected ()
No Results
No Results







.png)

.png)


















