BrightHire

Senior Site Reliability Engineer

Reposted 6 Days Ago
Remote
Hiring Remotely in USA
Senior level
The Senior Site Reliability Engineer will ensure the reliability and performance of critical systems by improving observability, database performance, Kubernetes management, and CI/CD pipelines, while enhancing developer experience and infrastructure.
The summary above was generated by AI

BrightHire is a category-creating, high-growth, Series B software company with a mission to give everyone the hiring experience they deserve.

We deliver on this mission by transforming the way many of the world’s leading companies build exceptional teams. We created the Interview Intelligence category, and our clients range from some of the world’s most innovative companies (Canva, OpenAI, Ramp, HubSpot) to the Fortune 500.

Location

Remote - USA

About the Role

You will own the end-to-end reliability and performance of many of our most critical systems. Working in lockstep with Product and Engineering, you will design, build, and refine the platform that our application and AI features run on, from Kubernetes and databases through CI/CD and observability. You will focus on keeping our systems fast, reliable, and easy for developers to work with. You will work on real infrastructure that supports features people use every day, on projects such as:

  • Continuing to improve and iterate on our observability stack, which includes Kibana, Grafana, OTel, and Elastic.
  • Improving database performance by analyzing slow and high-volume queries, tuning indexes, optimizing query patterns and timing, and recommending schema and code changes to keep QPS and latency low.
  • Improving and upgrading Kubernetes, including deploying new services, improving resource utilization, tightening security, and standardizing deployment patterns across teams.
  • Improving CI/CD pipelines for both backend and frontend services so engineers can ship quickly and safely, with clear feedback loops, fast build times, and reliable rollbacks.
  • Enhancing the local developer experience so that running and debugging the app locally feels fast, consistent, and representative of production.
  • Helping improve our CI/CD and observability for our ML pipeline and models, bringing MLOps best practices into our existing infrastructure.

What You’ll Bring

  • You have real-world experience running production systems and doing SRE, Platform, or DevOps work for web applications or APIs.
  • You are comfortable working across Kubernetes, CI/CD, databases, and backend services, and you enjoy owning problems end to end.
  • You have strong experience with Kubernetes in production environments, including cluster upgrades, workload deployments, scaling, and debugging.
  • You have experience with observability stacks (such as Elasticsearch and Kibana, Prometheus, Grafana, or similar) and can lead efforts like upgrading Kibana to new major versions and improving logs, metrics, and dashboards.
  • You have worked deeply with relational databases and SQL, know how to profile slow queries, design and tune indexes, and work with engineers to adjust query patterns, timing, and frequency to improve performance.
  • You are comfortable in at least one backend language (e.g., Python) and can read and modify application code to support infra and performance improvements.
  • You have experience improving CI/CD pipelines, including build and test speed, deployment workflows, and release strategies (such as blue/green or canary).
  • You have worked with infrastructure-as-code tools or similar patterns to manage environments in a repeatable way.
  • You think deeply about developer experience and reliability and use both metrics and empathy to guide your decisions.
  • You care about security, resiliency, and cost as integral aspects of the systems you build and manage.
  • You move fast and independently, but you know when to pull in teammates for pairing, reviews, or cross-team alignment.

About Our Team

  • You’ll have the opportunity to work on high-impact projects in small, autonomous squads, with the flexibility to lead initiatives or focus as an individual contributor depending on your goals and interests.
  • Our developer experience is thoughtfully designed, with fast CI (< 10 minutes), 1-click deploys, strong observability, and a clean codebase that enables you to move quickly and confidently.
  • Our culture supports sustainable, focused work with fully remote roles, regular working hours, no-meeting Wednesdays, and flexible time off to recharge when needed.
  • Our team is composed of smart, collaborative, and genuinely kind people, creating an environment where you can learn, grow, and do your best work.

Equal Employment Opportunity (EEO) Statement

Our company does not discriminate in employment on the basis of race, color, religion, sex (including pregnancy and gender identity), national origin, political affiliation, sexual orientation, marital status, disability, genetic information, age, membership in an employee organization, retaliation, parental status, military service, or other non-merit factor.

*Note to Recruiters and Placement Agencies: We do not accept unsolicited agency resumes. Please do not forward unsolicited agency resumes to our website. We will not pay fees to any third-party agency or firm and will not be responsible for any agency fees associated with unsolicited resumes. Unsolicited resumes received will be considered our property.

Top Skills

CI/CD
Elasticsearch
Grafana
Kibana
Kubernetes
Prometheus
Python
SQL
