Top Remote Reliability Engineer Jobs in Chicago, IL
Healthtech
The Lead Data Engineer modernizes and optimizes the Medicaid Market's data platform, manages ETL processes, and partners with the Business Intelligence team to enhance data accessibility and reliability, while also leading contract resources in a complex environment.
Top Skills:
Azure Data Factory, Corepoint, Databricks, Rhapsody, SQL Server, SSIS, SSRS
Legal Tech • Software
Lead automation and optimization of Filevine's data platform: performance-tune MSSQL/Postgres, optimize Snowflake, provision infrastructure with Terraform/AWS, run stateful containers on Kubernetes, integrate AI/LLM and MCP for operational automation, manage CI/CD, capacity planning, and documentation, and serve in a 24/7 on-call rotation.
Top Skills:
AWS, C#, Dapper, Docker, DynamoDB, Entity Framework, GitLab, Kubernetes, LLMs, MCP (Model Context Protocol), Microsoft SQL Server (MSSQL), Octopus Deploy, OpenSearch, Postgres, PowerShell, Python, Redis, Snowflake, Terraform
24 Days Ago
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will develop and support distributed storage services, ensuring reliability and operational safety, with a focus on automation and efficiency.
Top Skills:
AWS, Azure, DNS, Go, Google Cloud Platform, Kubernetes, Linux, Python, TCP/IP, TLS
Big Data • Cloud • Software • Database
Seeking a Site Reliability Engineer with expertise in networking and distributed systems to build secure multi-cloud infrastructure. Responsibilities include maintaining network architecture and ensuring reliable service-to-service communication. The role includes a 24/7 on-call rotation.
Top Skills:
AWS, Azure, BGP, DNS, GCP, IPv6, Kubernetes, Load Balancing, mTLS, Service Mesh, TCP/IP, TLS, VPCs, VPNs
Reposted 5 Days Ago
Travel
The Senior Site Reliability Engineer will automate and optimize infrastructure on Google Cloud, improve cost efficiency, and support on-call incidents, working closely with the engineering teams.
Top Skills:
Bash, Containers, Datadog, GCP, Helm, Istio, Kubernetes, Kustomize, Python, SQL
Artificial Intelligence • Other • Security • Software • Analytics • Big Data Analytics
The Lead Site Reliability Engineer will oversee the Infrastructure SRE team, focusing on system reliability, automation, and mentoring while collaborating with product engineering.
Top Skills:
CI/CD, Datadog, Docker, ELK Stack, GitOps, Go, Kubernetes, Linux/Unix, New Relic, NoSQL, Prometheus, Python, SQL, Stackdriver, Terraform
Software
Own reliability, performance, and scalability of PostgreSQL infrastructure. Implement HA, replication, observability, capacity planning, automation, and DR. Support engineering teams with migrations, query optimization, on-call incident response, runbooks, and tooling to enable safe DB operations.
Top Skills:
Ansible, Aurora, AWS RDS, Chef, Datadog, DynamoDB, ElastiCache, Go, Grafana, Indexing, MVCC, Patroni, PgBouncer, Postgres, Prometheus, Python, Query Planner, Replication, Ruby, SQL, Terraform, Vacuum Tuning, WAL
Artificial Intelligence • Healthtech • Software • Telehealth
The Senior Site Reliability Engineer will manage and evolve healthcare infrastructure, ensuring system resilience, compliance, and leading AI-driven operations for efficient healthcare delivery.
Top Skills:
AWS, Bash, Datadog, EC2, EKS, GitHub Actions, Go, Kubernetes, Python, RDS, Ruby, S3, Semaphore, Terraform
Healthtech
Develop and implement processes to ensure high availability and reliability of services. Responsibilities include incident management, automation, capacity planning, and risk mitigation.
Top Skills:
AWS, Azure, Datadog, Docker, Grafana, JavaScript, New Relic, Prometheus, Python, Ruby, Splunk, Terraform
Internet of Things • Cybersecurity
The Site Reliability Engineer will manage AWS GovCloud infrastructure, ensuring compliance and high availability while driving automation, security, and incident response best practices.
Top Skills:
Ansible, AWS GovCloud, Bash, Docker, ELK Stack, GitLab CI/CD, Grafana, Jenkins, Kubernetes, Prometheus, Python, Terraform
Information Technology • Business Intelligence • Consulting
The SRE role involves designing and operating reliable systems, building CI/CD pipelines, and enhancing developer experience in a cloud-native environment.
Top Skills:
App Insights, Argo CD, C#, Docker, Dynatrace, Flux, Helm, KMS, Kubernetes, Kusto, Kustomize, Node.js, OpenTelemetry, Python, Terraform, Vault
Other
The Senior Site Reliability Engineer at Juul Labs ensures operational stability and performance of hybrid cloud infrastructure, leads automation, and handles critical incidents.
Top Skills:
AWS, Bash, CloudFormation, GCP, Nutanix, PowerShell, Python, Terraform
eCommerce • Retail • Software
The Senior Database Reliability Engineer ensures database availability, reliability, and efficiency, driving initiatives for upgrades, automation, and security while mentoring team members.
Top Skills:
AWS, DynamoDB, Elasticsearch, MongoDB, MySQL, Postgres, PowerShell, Python, Redis, SQL Server
Artificial Intelligence • Cloud • Information Technology • Software
Build and operate production-grade AI infrastructure using Kubernetes, ensuring high availability, reliability, and performance. Develop custom operators and implement automation for efficient operations and monitoring.
Top Skills:
Ansible, Bash, ELK Stack, Enterprise Storage Systems, Grafana, High-Performance Networking, Kubernetes, Linux, NVIDIA GPU Technologies, Prometheus, Python, Terraform
Healthtech • Software
The SRE Technical Project Manager will lead project delivery, incident management, automation processes, and uptime communication, partnering with SRE and development teams to ensure system stability and scalability.
Top Skills:
AI Bots, Datadog, Jira, Jira Service Management, MS Teams, Opsgenie, PagerDuty
Food
The Reliability Engineer will manage maintenance of fixed assets, focusing on equipment reliability, predictive maintenance, and collaboration to reduce downtime and improve performance metrics of packaging operations.
Top Skills:
Automation Equipment, Thermoforming Packaging Machines, TPM
Automotive
As an SRE, you'll enhance monitoring platforms, improve system reliability, manage AI applications, and optimize cloud resources while participating in troubleshooting and preventative measures.
Top Skills:
AI, Dynatrace, GCP, Kubernetes, Python, Terraform
Fitness • Healthtech • Information Technology • Payments • Software
As the SRE Manager, you'll lead a team focused on the reliability and performance of production systems, ensuring operational stability, managing team staffing, and coordinating incident responses while partnering with engineering leaders.
Top Skills:
AWS, Azure, Bash, Chef, Docker, Dynos, Elasticsearch, F5, GitLab CI, Instana, Jenkins, Kubernetes, Linux, LogicMonitor, NGINX, OpenTelemetry, Opsgenie, PagerDuty, PHP, Python, RabbitMQ, Redis, Ruby, Terraform, Traefik, VMware
Software • Cryptocurrency
Manage and scale Kubernetes clusters, automate infrastructure, optimize performance, maintain blockchain nodes, and improve system reliability while collaborating with product teams.
Top Skills:
AWS (EC2, EKS, IAM, RDS, S3), Datadog, Docker, Kubernetes, OpenTelemetry, Pulumi, Terraform
Cloud • Information Technology
As a Staff Site Reliability Engineer, you will enhance cloud product lines, ensuring real-time scalability, collaborating with teams, and automating builds.
Top Skills:
Ansible, AWS, Azure, Bash, DNS, Docker, Envoy, GCP, Git, Go, Grafana, HAProxy, HTTP, Jenkins, Kafka, Kubernetes, Linux, MySQL, OCI, OpenTelemetry, Postgres, Prometheus, Puppet, Python, Redis, TCP/IP, Telegraf, Terraform, TLS
Security • Software
The Senior Site Reliability Engineer will ensure system reliability, implement automation, monitor performance, and collaborate on service objectives and incident responses.
Top Skills:
AWS, CircleCI, GCP, GitHub Actions, Go, Grafana, Kubernetes, Prometheus, Python, Terraform
Software
As a Site Reliability Engineer, you will enhance system reliability, manage cloud services, respond to incidents, and support network systems.
Top Skills:
Automation, Cisco Routing, Cloud Services, F5 Load Balancing, Fortinet Firewalls, Infrastructure Automation, Monitoring, Networking
Cloud • Security • Cybersecurity
As a Junior Site Reliability Engineer, you will support cloud operations, implement automation for cloud infrastructure, and ensure system reliability and security.
Top Skills:
Ansible, AWS, Azure, Bash, Elastic Stack, GCP, Jira, PowerShell, Python, ServiceNow, Splunk, Terraform
Healthtech • Information Technology • Telehealth
Lead Site Reliability Engineer responsible for ensuring cloud services reliability, automation, and performance while mentoring a team and collaborating cross-functionally. Drive initiatives to enhance incident management and enforce security compliance.
Top Skills:
Ansible, AWS, AWS CloudFormation, Azure, Bash, Datadog, Docker, ELK Stack, Go, GCP, Grafana, Kubernetes, Prometheus, Puppet, Python, Terraform
Big Data • Cloud • Software • Analytics
As a Site Reliability Engineering Intern, you'll monitor cloud services, assist in incident management, support automation, and collaborate with engineers to improve system reliability.
Top Skills:
AWS, Azure, Databases, GCP, Git, Grafana, Kafka, Linux/Unix Systems, Prometheus, Python, Shell Scripting, Terraform