Production Systems Engineer
You take pride in your work and in helping others learn, and you look for the same from everyone on your team. Your past experience shows an eye for detail and a desire to talk about the new thing you just learned or the new skill you just picked up.
You feel passionate about automating repetitive tasks. You don’t just think “there has to be a better way”; you find the better way and can’t help but show it off to your teammates.
This role is responsible for the overall health of our production environment running on Google Cloud Platform. You’ll maintain the Kubernetes clusters and GCE instances that run all of our sites and services, along with the routing and processes that glue them together.
- Performing day-to-day operational tasks on public-facing infrastructure (keeping existing things running and getting new things going).
- Owning configuration management and deployment tools.
- Assisting in the architectural design of new services and making them operate at scale.
- Monitoring and analyzing systems, services and service clusters, and optimizing performance and resource utilization.
- Tracing and troubleshooting misbehaving servers or services, and assisting with diagnosis and resolution.
- Assisting in or leading incident response, diagnosis and follow-up for system outages and alerts.
- Keeping supporting systems and platforms up to date, and recommending needed upgrades or migrations.
- Balancing security and risk assessment with business needs and processes.
- Building awesome tools and processes that help us achieve more together.
- 3+ years of experience in an SRE/Operations/DevOps role as part of a team.
- Experience managing high-traffic, customer-facing websites and servers.
- Familiarity with open source configuration management and orchestration tools.
- Comfort with the shell and scripting languages relevant to the role (Python and Bash required; others are welcome).
- Experience with Google Cloud Platform including networking, IAM, provisioning, and monitoring/tracing.
- Experience with software and service deployments built on Docker and Kubernetes.
- Desire to automate tasks and assume ownership of production infrastructure.