Top Senior Site Reliability Engineer Jobs in Chicago, IL
Big Data • Fintech • Information Technology • Business Intelligence • Financial Services • Cybersecurity • Big Data Analytics
The Staff Site Reliability Engineer will lead reliability strategies, manage high-risk initiatives, and enhance engineering standards while ensuring system reliability and operational excellence within a hybrid work environment.
Top Skills:
Bash, CI/CD, Database Architecture, Go, Google Cloud Platform, Infrastructure-as-Code, Kubernetes, Monitoring Platforms, Pulumi, Python, Terraform
Big Data • Cloud • Software • Database
As a Senior Site Reliability Engineer, you'll design and build complex systems, support Atlas platform operations, automate processes, and ensure high availability of services.
Top Skills:
AWS, Azure, DNS, GCP, Go, HTTP, Linux, Python, Ruby, TLS
Artificial Intelligence • Big Data • Healthtech • Machine Learning • Analytics • Biotech • Generative AI
Join the SRE team to design, deploy, and operate resilient cloud infrastructure. Recommend solutions, automate workflows, configure Terraform and CI, implement monitoring and alerts, and support developers and users.
Top Skills:
Ansible, Aurora MySQL, AWS, Azure, Bash, Chef, CloudFormation, Composer, Concourse, Dataproc, Docker, GCP, Go, HIPAA, HITRUST, ISO, Kubernetes, Packer, Postgres, Puppet, Python, Ruby, Salt, Slack, Terraform
AdTech
As a Site Reliability Engineer, you'll maintain the infrastructure for systems, ensure efficiency, automate processes, monitor databases, and participate in architecture discussions.
Top Skills:
Amazon Kinesis, AWS Lambda, AWS SNS, BigQuery, Docker, GCP (Google Cloud Platform), GitLab, Google Cloud Functions, Google Cloud Run, Google Pub/Sub, Grafana, Istio, Kafka, Kubernetes, MySQL, Prometheus, Spanner, SQL, Terraform
Artificial Intelligence • Machine Learning • Natural Language Processing • Software • Conversational AI
The engineer will build and operate AI/ML infrastructure, managing services on AWS and bare metal, using tools like Kubernetes and Terraform.
Top Skills:
AWS, Bash, Go, Kubernetes, Python, Slurm, Terraform
Artificial Intelligence • Machine Learning
Lead development of AI-assisted reliability tooling, own incident response end-to-end, improve observability and SLO/SLI frameworks, scale single-tenant SaaS operations, mentor engineers, and reduce recurring operational toil through engineering and automation.
Top Skills:
Cloud Platforms, Go, Kubernetes, Linux, LLM/AI Tooling, Logs and Tracing, Observability Tooling, Python, SLO/SLI Frameworks
Artificial Intelligence • Cloud • Consumer Web • Productivity • Software • App development • Data Privacy
The role involves defining reliability strategies, leading initiatives across teams, enhancing monitoring and incident response, and mentoring engineers at Dropbox.
Top Skills:
AI Technologies, Debugging, Distributed Systems, Incident Response, Observability, Reliability Risk Management, SLAs, SLOs
Digital Media • Gaming • Information Technology • Software • Sports • Esports • Big Data Analytics
Lead long-term strategy and architecture for cloud and on‑prem platform infrastructure, driving Kubernetes and multi‑cloud reliability, IaC/GitOps automation, observability, SLO/SLI/error‑budget practices, incident leadership, AI‑augmented tooling adoption, and mentorship of senior engineers to improve platform resilience and developer experience.
Top Skills:
Amazon Elastic Kubernetes Service (EKS), Autoscaling, AWS, Capacity Planning, CI/CD, GitOps, Go, Google Cloud Platform, Google Kubernetes Engine (GKE), Identity and Access Management, Infrastructure as Code, Kubernetes, Linux, Networking, Observability, Operators, Pulumi, Python, RKE2, Storage, Terraform
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will develop and support distributed storage services, ensuring reliability and operational safety, with a focus on automation and efficiency.
Top Skills:
AWS, Azure, DNS, Go, Google Cloud Platform, Kubernetes, Linux, Python, TCP/IP, TLS
Big Data • Cloud • Software • Database
Seeking a Site Reliability Engineer with expertise in networking and distributed systems for building secure multi-cloud infrastructure. Responsibilities include maintaining network architecture and ensuring reliable service-to-service communication, involving a 24/7 on-call rotation.
Top Skills:
AWS, Azure, BGP, DNS, GCP, IPv6, Kubernetes, Load Balancing, mTLS, Service Mesh, TCP/IP, TLS, VPCs, VPNs
Big Data • Cloud • Software • Database
Develop and maintain Kubernetes runtime environments, support developers, resolve critical issues, and participate in on-call rotations for production systems.
Top Skills:
AWS, Azure, cert-manager, CoreDNS, CRDs, CRI, CSI, Gatekeeper, GCP, Go, Helm, Kubernetes, Kustomize, Operators, Python, Terraform
Digital Media • Information Technology • News + Entertainment
Responsible for ensuring reliability, scalability, and performance of data platforms. Design monitoring and alerting, automate deployments and recovery, optimize storage and query performance, troubleshoot incidents, plan capacity and scaling, document operations, enforce security/compliance, and collaborate with data engineering, product, and data science teams to maintain high availability of large-scale data systems.
Top Skills:
Ansible, AWS, Azure, CI/CD, Docker, ELK Stack, GCP, Go, Grafana, Java, Kubernetes, MySQL, NoSQL, Postgres, Prometheus, Python, Scala, Terraform
Fintech • Payments • Financial Services
The Site Reliability Engineer will automate processes, manage server deployments, and collaborate with teams to enhance operational workflows in a trading environment.
Top Skills:
Ansible, C++, Chef, Cloud Infrastructure, Distributed Systems, Docker, Go, Grafana, HashiCorp Nomad, HPC Clusters, Kubernetes, Linux, Perl, Podman, Prometheus, Puppet, Python, Rancher, Rust, Salt
Cloud • Information Technology • Security • Software • Cybersecurity
This internship role focuses on SRE skills, requiring collaboration and problem-solving in dynamic environments for Zscaler's Zero Trust Exchange team.
Top Skills:
Ansible, AWS ECS, Kubernetes, Linux, Python, Terraform
Cloud • Security • Software • Cybersecurity • Automation
As a Cloud Cost Utilization SRE at GitLab, you'll manage cloud spending, improve tracking and optimization of cloud usage, and collaborate with finance and engineering teams to enhance cost efficiency across AWS and GCP.
Top Skills:
Ansible, AWS, ELK, GCP, Grafana, Loki, Mimir, Prometheus, Tempo, Terraform
Information Technology • Legal Tech
The Senior Technology Site Reliability Engineer is responsible for maintaining and optimizing infrastructure and applications, ensuring reliability and performance while automating processes and collaborating with teams.
Top Skills:
AWS, Chef, Datadog, Go, Grafana, Java, Prometheus, Puppet, Python, Salt, Terraform
Fintech
Lead SRE work partnering with development teams to design and implement availability, scalability, observability, and automation for production systems. Build tooling, manage incident response and RCAs, optimize capacity and performance, mentor engineers, maintain runbooks, and participate in a 24x7 on-call rotation.
Top Skills:
Aurora, AWS, Chef, CI/CD, Docker, DynamoDB, Git, Go, IP, Java, JavaScript, Jenkins, JMS, Kafka, Kubernetes, Linux, Maven, Memcached, Microservices, Observability, Oracle, Python, Redis, Ruby, SQS, Swarm, TCP, UDP
Financial Services
Design, build, and operate reliable cloud infrastructure and networking (multi-account AWS, VPC, IAM). Implement IaC, CI/CD pipelines, observability (logging/metrics/alerting), automation, and reliability guardrails. Provide production support and incident response, perform root cause analysis, and collaborate with application teams to co-own system design and continuous improvement, using AI-assisted tools where appropriate.
Top Skills:
.NET, AI-Assisted Tools (Claude Code, GitHub Copilot, Windsurf), AWS, AWS Organizations, Bash, CI/CD, CloudFormation, Elastic Stack, Git, IAM, Infrastructure as Code, Java, Jenkins, Node.js, Observability, OpenSearch, PowerShell, Python, Terraform, VPC
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will lead security design and implementation for cloud infrastructures, mentor teams, and automate security solutions.
Top Skills:
Ansible, AWS, Azure, Cloud Security Tools, CloudFormation, GCP, Go, Terraform
Information Technology • Software
Seek an SRE/Network Engineer with deep MAAS and bare-metal automation expertise to manage hundreds of nodes across distributed sites. Responsibilities include Linux administration, hardware-level diagnostics (BIOS/IPMI/RAID), network design (VLANs/L2-L3/VPN/UniFi), infrastructure automation (Ansible, Bash/Python, Git), observability (Prometheus/Grafana, ELK/Graylog/Loki), PXE/MAAS-based OS provisioning, API integrations, virtualization (OpenStack/Kolla-Ansible, Proxmox, VMware), and container workload support.
Top Skills:
Ansible, Bash, BIOS, cloud-init, Cloudflare API, Debian, ELK, Git, GitOps, Grafana, Graylog, IPMI, Ironic, Kolla-Ansible, L2 Routing, L3 Routing, Linux, Loki, MAAS, OpenStack, Preseed, Prometheus, Proxmox VE, PXE, Python, RAID, Ubuntu, UniFi, VLAN, VMware ESXi, VPN
Artificial Intelligence • Cloud • Information Technology • Legal Tech • Productivity • Software
The Senior Site Reliability Engineer will focus on automating infrastructure, enhancing cloud resilience, supporting deployments, and mentoring teams in reliability best practices, while participating in on-call rotations.
Top Skills:
Azure, Bash, CI/CD, Docker, Go, Grafana, Java, Kubernetes, PowerShell, Prometheus, Python, Ruby, Terraform
Artificial Intelligence • Cloud • Information Technology • Mobile • Software • Consulting
The role involves designing and implementing OpenTelemetry solutions, optimizing telemetry infrastructure, establishing SRE practices, and managing observability across cloud platforms.
Top Skills:
ArgoCD, AWS, Azure, Bash, CloudFormation, Docker, GCP, GitHub Actions, GitLab CI, Go, Java, Jenkins, Node.js, OpenTelemetry, PowerShell, Pulumi, Python, Rust, Terraform
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
Own reliability, monitoring, and incident response for AI infrastructure; build automation and CI/CD tooling; manage Kubernetes/Docker production workloads; partner with infrastructure, security, and compliance; improve observability and documentation; develop internal full‑stack tooling in Go or Python.
Top Skills:
Ansible, AWS, Bash, Chef, CI/CD, Docker, EC2, Git, Go, Kubernetes, Linux, Log Aggregation, Network Security, Puppet, Python, Ruby, Salt, Terraform
HR Tech • Information Technology • Professional Services • Sales • Software
Own and operate production-grade Kubernetes infrastructure on AWS, build GitOps CI/CD with GitHub Actions and ArgoCD, develop AI agents and internal DevOps tooling, maintain Datadog-based observability, and manage on-call incident response while collaborating with engineering teams to improve reliability and delivery speed.
Top Skills:
AI/LLM, ArgoCD, AWS, CI/CD, Datadog, GitHub Actions, GitOps, Go, Kubernetes, Python
Hardware
Lead technical services engineer guiding and training engineers, designing IT architecture, troubleshooting network security and third-party control integrations, coordinating projects, providing customer training and field support, and managing personnel and resources.
Top Skills:
802.1X, AMX, Crestron, Excel, Microsoft Outlook, Microsoft PowerPoint, Microsoft Word, RADIUS, Security Certificate Management