Top Remote Reliability Engineer Jobs in Chicago, IL
Artificial Intelligence
Lead reliability, scalability, security, and automation efforts for business-critical services. Build infrastructure-as-code, implement compliance (FedRAMP/IL5), plan roadmaps, optimize cost, and collaborate with security and architects.
Top Skills:
AI Tools, Ansible, AWS, Azure, CMMC, DoD Impact Level 5, FedRAMP, GCP, Go, IL5, Java, NIST 800-53, Pulumi, Python, Ruby, Terraform
Computer Vision • Information Technology • Machine Learning • Natural Language Processing • Real Estate • Software
The SRE will maintain infrastructure for SaaS products on AWS, support developers, manage platform components, and handle IT tasks.
Top Skills:
AWS, Computer Vision, IaC, Large Language Models, NLP, Terraform
Artificial Intelligence • Information Technology • Consulting
As a Senior Site Reliability Engineer, you will enhance the reliability and performance of our inference platform, leveraging Kubernetes and Terraform while ensuring smooth scalability of systems under load.
Top Skills:
Bash, Grafana, Kubernetes, MLOps, Prometheus, Python, Ray, Terraform, Triton, vLLM
Artificial Intelligence • Information Technology • Machine Learning • Software • Cybersecurity • Generative AI • Data Privacy
Lead global SRE and infrastructure teams to ensure reliability, scalability, and cost-efficiency of production and developer platforms. Define cloud and Kubernetes architecture, IaC, CI/CD, SLOs/SLIs, incident management, and cloud cost optimization while partnering with Security, Product, Finance, and Engineering.
Top Skills:
AI, Automation, AWS, CI/CD, Cloud-Native Systems, GCP, Infrastructure as Code, Kubernetes, Terraform
Information Technology • Internet of Things • Software • Virtual Reality
Lead reliability, availability, and resiliency strategies for large-scale systems, drive operational excellence, and provide technical mentorship across engineering teams.
Top Skills:
AWS, CI/CD, Java, MongoDB, RabbitMQ, ZooKeeper
Computer Vision • Machine Learning • Software
As a Site Reliability Engineer, ensure the reliability, performance, and scalability of Ditto's cloud infrastructure by developing observability solutions, leading incident management, and collaborating with product engineering teams.
Top Skills:
AWS, Azure, C, Datadog, GCP, Go, Grafana, Helm, Java, Kubernetes, Prometheus, Rust, Terraform
Information Technology
The Lead Site Reliability Engineer will ensure platform reliability and performance, guiding SRE principles, managing incidents, and fostering collaboration across teams while leveraging cloud technologies and automation.
Top Skills:
AWS, Azure, Azure DevOps, Bash, Bicep, CloudFormation, GitHub Actions, Go, Jenkins, PowerShell, Python, Terraform
Artificial Intelligence • Healthtech • Software
The Staff Site Reliability Engineer will lead the reliability of production systems by defining SRE practices, improving observability, and ensuring fault-tolerance in cloud environments.
Top Skills:
AWS, Go, Kubernetes, Postgres, Python, Terraform, TypeScript
Cloud • Information Technology
The Senior Site Reliability Engineer will ensure high availability of Vultr's control plane and infrastructure, focusing on reliability, automation, and observability for distributed systems.
Top Skills:
BGP, GitLab CI/CD, Grafana, KVM, Libvirt, MySQL, Open vSwitch, PHP, Puppet, QEMU, Sentry, Sumo Logic
Digital Media • Social Media • Software • Sports
Lead the technical architecture and execution of migration to AWS, drive developer enablement, and automate infrastructure using code-first principles.
Top Skills:
AWS EKS, Datadog, GitHub Actions, Go, Istio, k6, Kubernetes, Node.js, Terraform
Legal Tech
Join the Engineering team to enhance cloud solutions, improve service reliability, automate tasks, and support software delivery and compliance initiatives.
Top Skills:
Ansible, AWS, CircleCI, Docker, Go, Heroku, Jenkins, Kubernetes, Logentries, New Relic, Postgres, Python, Ruby on Rails, Redis, Ruby, Terraform, Twilio SendGrid
Software
As a Site Reliability Engineer, you'll enhance system reliability, collaborate on production readiness, define SLIs/SLOs, and improve incident response.
Top Skills:
AWS, Datadog, Grafana, Kubernetes, OpenTelemetry, Prometheus, TypeScript
Cloud • Security • Software • Cybersecurity
The Senior Site Reliability Engineer will enhance performance and reliability of distributed systems, define KPIs, and collaborate cross-functionally to improve infrastructure and operational efficiency.
Top Skills:
ADBMS, Bash, Datadog, Grafana, Internet Protocols, JavaScript, Oracle SQL, Prometheus, Python
Cloud • Security • Software • Cybersecurity
The Senior Site Reliability Engineer will manage scalable systems on the ZTNA Cloud Platform, automate operations, optimize performance, and work with multiple teams to enhance security products.
Top Skills:
Apache, Argo CD, AWS, Celery, Elasticsearch, Helm, Jenkins, Kubernetes, Linux, Nginx, OpenSearch, Postgres, RabbitMQ, Terraform, Ubuntu
Cloud • Security • Software • Cybersecurity
The Senior Lead Site Reliability Engineer will ensure performance and uptime of security products, develop automation pipelines, and improve monitoring systems, working closely with various teams.
Top Skills:
Azure, Databricks, Docker, Go, Jenkins, Kubernetes, Python, Terraform
Information Technology • Consulting
The Senior Site Reliability Engineer is responsible for assessing and improving the reliability and operational resilience of enterprise infrastructure, focusing on stabilization strategies amid modernization initiatives.
Top Skills:
AWS, Citrix, GCP, Kubernetes, Azure
Software
The Senior Site Reliability Engineer will ensure the reliability and performance of critical systems by improving observability, database performance, Kubernetes management, and CI/CD pipelines, while enhancing developer experience and infrastructure.
Top Skills:
CI/CD, Elasticsearch, Grafana, Kibana, Kubernetes, Prometheus, Python, SQL
Database • Analytics
This role involves ensuring the reliability and performance of ClickHouse's cloud infrastructure, collaborating with engineering teams, incident management, and driving continuous improvement in service availability.
Top Skills:
Ansible, AWS, Azure, ClickHouse, Docker Swarm, Go, Google Cloud Platform, Kubernetes, Puppet, Python, Terraform
Edtech
The Lead Software Engineer will lead the SRE team, focusing on reliability, performance optimization, security, and mentoring developers, while improving overall platform resilience.
Top Skills:
Active Job, Ansible, AWS, AWS CloudWatch, EC2, ECS, Elasticsearch, Git, GCP, Google Cloud Stackdriver, Jenkins, Jira, Kubernetes, Memcached, MongoDB, New Relic, Node.js, Postgres, Redis, Ruby on Rails, Sidekiq, Spinnaker, Terraform, Terragrunt
Artificial Intelligence • Cloud • Information Technology • Software
Build and operate production-grade AI infrastructure using Kubernetes, ensuring high availability, reliability, and performance. Develop custom operators and implement automation for efficient operations and monitoring.
Top Skills:
Ansible, Bash, ELK Stack, Enterprise Storage Systems, Grafana, High-Performance Networking, Kubernetes, Linux, NVIDIA GPU Technologies, Prometheus, Python, Terraform
Legal Tech • Software
As a Site Reliability Engineer, you'll develop autonomous systems, improve CI/CD pipelines, mentor junior engineers, and ensure software reliability and security in a 24/7 environment.
Top Skills:
Bash, PowerShell, Python
Information Technology • Legal Tech
The role involves maintaining and improving Azure infrastructure, managing Infrastructure as Code with Terraform, enhancing security measures, and operating CI/CD pipelines.
Top Skills:
Azure, Azure DevOps, Bash, CircleCI, Datadog, EFK, ELK, GitHub Actions, PowerShell, Python, Terraform
News + Entertainment
As an Ads Reliability Engineer, you will ensure the reliability of Netflix's Ad Suite by designing scalable infrastructure, collaborating with teams, and implementing automation for monitoring and incident response.
Top Skills:
AWS, Azure, GCP, Go, Java, Kubernetes, Python, Terraform
Artificial Intelligence • Fintech • Machine Learning • Natural Language Processing • Business Intelligence
The Senior Director of SRE leads and defines reliability and operational excellence across products, manages the SRE team, and scales reliability practices within the organization.
Top Skills:
AWS, Azure, Cloud-Native Networking, Distributed Systems, GCP, Kubernetes, Microservices, Site Reliability Engineering Principles
News + Entertainment
The Site Reliability Engineer will design and maintain infrastructure, improve software reliability, manage incidents, and promote engineering best practices across Netflix.
Top Skills:
AWS, Azure, GCP, Go, Java, Kubernetes, Python, Terraform