Top Senior Site Reliability Engineer Jobs in Chicago, IL
Cloud
The role involves building and managing observability infrastructure in GCP, automating deployments, and optimizing data processes for high reliability.
Top Skills:
GKE, Go, GCP, Grafana, Kubernetes, OpenTelemetry, Python, Ruby, Splunk, Terraform
Cloud • Security • Software
As a Site Reliability Engineer, you will design, deliver, and maintain cloud-based infrastructure, ensuring resilient and secure enterprise software solutions through optimized CI/CD processes.
Top Skills:
CI/CD, Docker, GCP, Git, Go, Kubernetes
Fitness • Healthtech • Information Technology • Payments • Software
The Site Reliability Engineer will enhance system reliability, manage cloud infrastructure, automate processes, support CI/CD pipelines, and troubleshoot production issues.
Top Skills:
Ansible, AWS, Bash, Chef, Docker, Git, GitLab, Jenkins, Kubernetes, MySQL, Postgres, Python, SQL Server, Terraform, VMware
Software
Lead SRE to define SRE strategy, architecture, and roadmap; design and operate containerized, compliant cloud environments; build observability, incident management, automation, and developer platform capabilities; mentor SRE team and collaborate with security, compliance, and product teams to ensure reliability at scale.
Top Skills:
AWS, AWS Marketplace, Azure, Azure Marketplace, GCP, Google Cloud Marketplace, Grafana, Kubernetes, Prometheus, Terraform
Artificial Intelligence • Insurance • Software • Automation
The Staff Site Reliability Engineer will build and scale infrastructure for Assured's platform, automate delivery, enhance observability, and lead mentoring initiatives.
Top Skills:
AWS, Kubernetes, Postgres, Terraform
Artificial Intelligence • Other • Sales • Software
The role involves designing and advancing infrastructure for the engineering team, ensuring the reliability of Kubernetes clusters, automating operations, and building machine learning infrastructure.
Top Skills:
Argo, AWS, Azure, CloudFormation, Flux, GitHub Actions, Go, GCP, Kubernetes, Postgres, Python, Terraform
Agency • Information Technology
Lead SRE role designing and maintaining CI/CD pipelines (GitHub Actions), containerized deployments (Docker, Kubernetes, AKS, Helm), web/mobile app releases, observability, automated testing, and DevOps best practices across cloud environments with cross-functional collaboration and regulatory compliance.
Top Skills:
AKS, Android, Azure Application Insights, Azure Log Analytics, Azure Monitor, Bash, Branching, Docker, Docker Compose, Git, Git Hooks, GitHub Actions, Google Play, Helm, Heroku, iOS, iOS App Store, Java, Kubernetes, npm, PowerShell, Pull Requests, Python, SonarQube, Veracode, Vercel
Digital Media • Social Media • Software • Sports
Lead the technical architecture and execution of migration to AWS, drive developer enablement, and automate infrastructure using code-first principles.
Top Skills:
AWS EKS, Datadog, GitHub Actions, Go, Istio, k6, Kubernetes, Node.js, Terraform
Computer Vision • Machine Learning • Software
As a Site Reliability Engineer, ensure the reliability, performance, and scalability of Ditto's cloud infrastructure by developing observability solutions, leading incident management, and collaborating with product engineering teams.
Top Skills:
AWS, Azure, C, Datadog, GCP, Go, Grafana, Helm, Java, Kubernetes, Prometheus, Rust, Terraform
Artificial Intelligence • Fintech • Machine Learning • Natural Language Processing • Business Intelligence
Lead architecture and implementation of reliability platforms and SRE practices for a production SaaS. Build self-service reliability tooling, drive AIOps automation, advance observability (monitoring, tracing, profiling), lead incident response and postmortems, mentor engineers, and embed production readiness across teams to achieve 99.99% uptime.
Top Skills:
AWS, Azure, Continuous Profiling, Datadog, DNS, ELK, GCP, Go, Grafana, HTTP/S, Kubernetes, Load Balancing, OpenTelemetry, Prometheus, Python, TCP/IP
Legal Tech • Software
Lead Site Reliability Engineer responsible for platform availability and reliability of RelativityOne. Drive SRE best practices, build tools, lead projects, coach SREs, work with stakeholders, support incidents, run postmortems, and improve monitoring, automation, and operational efficiency.
Top Skills:
CI/CD, DevOps, Jenkins, Jira, Kubernetes, Azure, Monitoring and Alerting, New Relic, NoSQL, PowerShell, Relativity Server, RelativityOne, SQL, Tableau
Other
As a Site Reliability Engineer, you will design cloud platforms, automate operations, maintain infrastructure, and support engineering teams in delivering reliable services.
Top Skills:
Ansible, AWS, Azure, Bash, CircleCI, CloudFormation, Datadog, DNS, Docker, GitLab CI, Go, GCP, Grafana, HTTP, HTTPS, Jenkins, Kubernetes, KVM, Linux, Perl, Prometheus, Python, Ruby, TCP/IP, Terraform, Unix, VMware
Healthtech • Other • Software
As a Senior Database Site Reliability Engineer, you'll design, implement, and maintain PostgreSQL systems, ensure reliability, automate maintenance tasks, and participate in incident response.
Top Skills:
Ansible, Bash, Datadog, Grafana, New Relic, Postgres, PowerShell, Prometheus, Python, Terraform
Software • Financial Services
Ensure platform reliability, performance, and availability by implementing observability, automating infrastructure, participating in on-call rotations and post-mortems, partnering with Product and Engineering, designing scalable architectures, mentoring teammates, and integrating Dynatrace with Azure DevOps and Jira while supporting compliance (SOC/FedRAMP).
Top Skills:
.NET, AKS, Alpine, Ansible, AppInsights, ARM Templates, AWS, Azure DevOps, Bash, Bicep, C#, Chef, CloudFormation, Datadog, Debian, Dynatrace, EKS, GCP, Git, Gks, Grafana, Helm, Jira, Kubernetes, Log Analytics, Azure, New Relic, OneStream Software, OpenShift, PowerShell, PowerShell DSC, Prometheus, Puppet, Python, REST APIs, SQL, Terraform, Ubuntu
Fintech • Information Technology
As a Site Reliability Engineer at Alpaca, you will ensure system reliability and performance, troubleshoot issues, and collaborate with teams to design scalable features.
Top Skills:
Go, GORM, Linux, pgx, Postgres, Prometheus, sqlc
Gaming • Software
The Site Reliability Engineer will manage infrastructure stability and scalability, lead cloud migrations, and optimize performance across systems while mentoring team members.
Top Skills:
Ansible, AWS, Azure, Bash, Chef, CloudFormation, Datadog, Docker, ELK Stack, GCP, Go, Grafana, Kubernetes, Prometheus, Puppet, Python, Terraform, Unix/Linux
Artificial Intelligence • Cloud • Information Technology • Software • Big Data Analytics
Founding Staff SRE for Volcano: define SLOs/error budgets, architect multi-region Kubernetes infrastructure, build GitOps/CI-CD with ArgoCD/Helm/Terraform, scale managed Postgres/Redis/object storage, implement observability with Datadog/Prometheus/Grafana, lead incident response and SRE culture, and mentor cross-functional teams.
Top Skills:
ArgoCD, Canary Deployments, CI/CD, CNI, Datadog, GitOps, Grafana, Helm, Ingress, Kubernetes, Object Storage, Postgres, Prometheus, Redis, Service Mesh, Terraform, Terragrunt
Software
As a Site Reliability Engineer, you'll enhance system reliability, collaborate on production readiness, define SLIs/SLOs, and improve incident response.
Top Skills:
AWS, Datadog, Grafana, Kubernetes, OpenTelemetry, Prometheus, TypeScript
Healthtech • Software
The SRE Technical Project Manager will lead project delivery, incident management, automation processes, and uptime communication, partnering with SRE and development teams to ensure system stability and scalability.
Top Skills:
AI Bots, Datadog, Jira, Jira Service Management, MS Teams, Opsgenie, PagerDuty
Real Estate • Financial Services • PropTech
Support and optimize products migrated to AWS, implement cloud best practices, maintain operational coverage, enhance automation, observability, CI/CD/GitOps, and security. Collaborate with development and platform teams to scale, troubleshoot, and ensure reliable SaaS operations.
Top Skills:
AMIs, ArgoCD, AWS, AWS Elastic Beanstalk, AWS Transfer Family, Azure DevOps, Bash, CloudWatch, curl, Docker, EC2, EKS, FluxCD, Git, GitOps, HTTP, Istio, Kubernetes, Linkerd, Load Balancer, PowerShell, Python, RDS, SQL, Terraform, wget
Cloud
The Site Reliability Engineer will manage Kubernetes platforms, optimize AWS cloud infrastructure, ensure high availability, and automate deployment while handling troubleshooting and security compliance.
Top Skills:
AWS, Bash, CI/CD, CloudWatch, ELK Stack, Go, Grafana, Helm, Istio, Kubernetes, Prometheus, Python, Terraform
Cloud
The Senior Site Reliability Engineer will enhance the Splunk ecosystem and develop an Observability Platform by automating infrastructure and managing complex distributed systems, while optimizing log collection and incident response.
Top Skills:
AWS, GCP, Go, Kubernetes, Linux, OpenTelemetry, Python, Ruby, Splunk, Terraform
eCommerce
Ensure reliability and availability of Tradeweb's global AWS platform through IaC automation, observability and SLO definition, incident triage and resolution, on-call duties, collaboration with development teams, and security-focused platform improvements.
Top Skills:
ArgoCD, AWS, AWS Lambda, EKS, GitSecOps, Infrastructure as Code (IaC), Kubernetes (K8s), Kustomize, LGTM, Linux/Unix, Pulumi, Python, SMS, SNS
Fintech
Responsible for enhancing application infrastructure, ensuring reliability and scalability, automating processes, implementing observability, and collaborating with software development teams.
Top Skills:
AWS, Docker, Git, Go, Java, JavaScript, Kubernetes, Linux, Python, Ruby, Swarm
Fintech
Design, build, and maintain scalable, reliable application infrastructure. Automate deployments and configuration, implement observability and monitoring, troubleshoot performance, advise development teams on SDLC and microservice best practices, create runbooks, participate in 24x7 on-call rotation, and ensure security and disaster recovery readiness.
Top Skills:
AWS, CI/CD, Docker, Git, Go, IP, Java, JavaScript, Kubernetes, Linux, Monitoring, Observability, Python, Ruby, Scripting Languages, Security Encryption Protocols, Swarm, TCP, UDP