Top Remote Reliability Engineer Jobs in Chicago, IL
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
The Senior Site Reliability Engineer will manage deployments, operations, and incident handling for large-scale AI GPU platforms while ensuring high performance and resilience in configurations.
Top Skills:
C++, Kubernetes, Linux, Python
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
Design and maintain large-scale Kubernetes clusters, ensuring reliability through monitoring, automation, and incident response.
Top Skills:
Docker, Go, Kubernetes, Linux, Networking, OpenStack, Perl, Python, Ruby
Marketing Tech
The Cloud Reliability Engineer develops, configures, and deploys cloud tools, enhances applications, ensures observability, and participates in on-call rotations.
Top Skills:
AWS, CI/CD, Docker, GitHub Actions, Go, Google BigQuery, GCP, Kubernetes, Linux, Python, SQL, Terraform
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
The role involves architecting and operating large-scale observability systems, designing resilient telemetry pipelines, automating operations, and leading incident responses while collaborating with various teams.
Top Skills:
Elasticsearch, Flink, Go, Jaeger, Kafka, Loki, Mimir, OpenSearch, OpenTelemetry, Prometheus, Python, Spark, Tempo, Thanos
Cloud • Information Technology • Security • Software • Cybersecurity
The role involves managing high-impact customer escalations, acting as a liaison between engineering and support, debugging complex cloud issues, and enhancing product reliability.
Top Skills:
C Programming, cURL, Docker, Kubernetes, Linux, Postman, TCP/IP, Unix
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
The Principal Staff SRE will lead initiatives in building and optimizing core infrastructure services on-prem and cloud, deploying and managing services at scale, and improving performance with automation and monitoring tools.
Top Skills:
DHCP, DNS, eBPF, Go, LDAP, Linux, NTP, Python, Terraform, XDP
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
The Senior Site Reliability Engineer (SRE) at NVIDIA is responsible for designing, building, and maintaining large-scale production systems, focusing on reliability and efficiency, automation, and continuous improvement.
Top Skills:
Containers, Go, Kubernetes, Linux, Networking, OpenStack, Perl, Python, Ruby
Financial Services
As a Software Engineer, you will enhance the reliability of trading systems, focusing on software engineering initiatives for performance and automation.
Top Skills:
Java, Kotlin, Linux, Unix
Software
The Lead Site Reliability Engineer will oversee the architecture and operational excellence of Mattermost's infrastructure, mentoring teams and driving strategic initiatives for performance and reliability in regulated sectors.
Top Skills:
AWS, Grafana, Kubernetes, Prometheus, Terraform
Information Technology
As a Site Reliability Engineer at New Era Technology, you'll focus on ensuring operational efficiency, creating reliable systems, and enhancing service performance through AWS expertise.
Top Skills:
AWS
Software
As a Senior Site Reliability Engineer at Regrello, you'll shape the developer platform, collaborate with customers, and ensure the reliability and security of infrastructure and applications.
Top Skills:
AWS, Azure, CircleCI, GCP, GitHub Actions, GitLab CI, Go, Kubernetes, Terraform
Artificial Intelligence • Real Estate
As a Senior Site Reliability Engineer, you will enhance platform reliability and observability, streamline incident response, improve cloud infrastructure, and collaborate across teams to drive operational excellence.
Top Skills:
AWS, CircleCI, Datadog, GitHub Actions, Grafana, Prometheus, Terraform
Software
The Senior Site Reliability Engineer will ensure the reliability and performance of critical systems by improving observability, database performance, Kubernetes management, and CI/CD pipelines, while enhancing developer experience and infrastructure.
Top Skills:
CI/CD, Elasticsearch, Grafana, Kibana, Kubernetes, Prometheus, Python, SQL
Security • Cybersecurity
The role involves leading initiatives for managing federal cloud environments, focusing on DevOps best practices, infrastructure reliability, and mentoring junior engineers.
Top Skills:
AWS, Go, Kafka, Kubernetes, Python, Redis, Terraform
Aerospace • Manufacturing
As a Site Reliability Engineer, you'll build and manage observability platforms for satellite communications, define SLOs/SLIs, and collaborate on incident response and deployment automation.
Top Skills:
Argo CD, AWS, ELK, GCP, Go, Grafana, Istio, Jaeger, Kubernetes, Linkerd, Loki, OpenTelemetry, Prometheus, Python, Tempo, Terraform
Aerospace • Manufacturing
The Staff Site Reliability Engineer will design and manage Aalyria's centralized observability platform, focus on metrics, logging, and tracing systems, implement SLOs and SLIs, automate deployments, and drive incident response strategies for enhanced reliability across satellite and cloud platforms.
Top Skills:
AWS, ELK, GCP, GitOps, Go, Grafana, Jaeger, Java, Kubernetes, Loki, OpenTelemetry, Prometheus, Python, Tempo, Terraform
Automotive
Design and implement scalable cloud infrastructure, monitor performance, automate processes, ensure security and compliance, and lead a DevOps team.
Top Skills:
AWS, Bash, CI/CD, Docker, ELK Stack, GCP, Grafana, Kubernetes, Prometheus, Python, Terraform
Artificial Intelligence • Information Technology • Logistics • Machine Learning • Software
Lead reliability initiatives for the production platform, manage incident response, define SLIs/SLOs, and enhance security by embedding it into delivery pipelines. Drive platform improvements in AWS and CI/CD processes.
Top Skills:
Aurora, AWS, Bazel, CI/CD, Dagster, dbt, DuckDB, DynamoDB, ECS, Java, JavaScript, Kubernetes, Python, Spacelift, SQS, SSM, Terraform, Trino, TypeScript
Big Data • Healthtech • Information Technology • Analytics
As a Lead Site Reliability Engineer, you'll design and manage scalable cloud infrastructure on GCP, optimize CI/CD processes, and ensure system reliability through observability and incident response, while mentoring others in a cross-product SRE group.
Top Skills:
Bash, GitLab CI/CD, GKE, Google Cloud Platform, Jenkins, Python, Sentry, Sumo Logic, Terraform
Cloud • Fintech • Information Technology • Software • Business Intelligence
As a Site Reliability Engineer, you will ensure production system reliability, optimize performance, respond to incidents, and collaborate on infrastructure improvements.
Top Skills:
Ansible, AWS, Bash, Datadog, Docker, ELK, Git, Grafana, Kubernetes, New Relic, OpenTelemetry, Prometheus, Python, React, Ruby, Ruby on Rails, Terraform
Healthtech • Social Impact
As a Senior Site Reliability Engineer at Virta Health, you'll build automation and tooling for reliability, enhance observability, and mentor engineering teams in best practices.
Top Skills:
AI, AIOps, Go, ML, Python, Terraform
Information Technology • Software • Web3
As a Software Engineer focused on SRE and DevSecOps, you will design scalable infrastructure, implement CI/CD pipelines, and automate processes while collaborating with teams to enhance performance and security.
Top Skills:
Ansible, Bash, Datadog, Docker, GCP, Grafana, Kubernetes, Python, React, Rust, Solidity, Terraform, Web3
Cloud • Security • Software
The Site Reliability Engineer will design, automate and scale cloud infrastructure while ensuring uptime, performance, and security best practices.
Top Skills:
Ansible, AWS, Azure, Chef, Docker, GCP, Go, JavaScript, Kubernetes, Linux, Puppet, Python, Ruby, SaltStack, Terraform
Hardware • Machine Learning • Security • Software
The Site Reliability Engineer will manage software deployment for IoT devices, improve observability, maintain dashboards, automate processes, and collaborate on incident responses.
Top Skills:
Ansible, AWS, Bash, C/C++, Datadog, Grafana, Groovy, Java, JavaScript, NoSQL, Postgres, Prometheus, Python, R, Sigma, SQL, Terraform
Artificial Intelligence • Cloud • Fintech • Machine Learning • Mobile • Software
The Staff Site Reliability Engineer will design, implement, and optimize infrastructure for AI services, ensure reliability and performance, and drive automation and observability excellence across engineering teams.
Top Skills:
Azure, Azure DevOps, Docker, ELK Stack, GitHub Actions, Grafana, Kubernetes, Mimir, Postgres, Prometheus, SQL Server, TeamCity, Terraform