As a Site Reliability Engineer (SRE), you'll help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems. Much of our support and software development focuses on optimizing existing systems, building infrastructure, and reducing work through automation. You’ll join a team of curious problem solvers with a diverse set of perspectives who are thinking big and taking risks. In this environment, you’ll take the lead on relevant projects, supported by an organization that provides the support and mentorship you need to learn and grow. As an SRE, you’ll be focused on running better production applications and systems.
- Design, code, test, and deliver software to automate manual operational work
- Troubleshoot priority incidents, facilitate blameless post-mortems and ensure permanent closure of incidents
- Engage with development team throughout the life cycle to help develop software for reliability and scale, ensuring minimal refactoring or changes
- Identify application patterns and analytics in support of better service level objectives
- Design self-healing and resiliency patterns
- Design automated software and product upgrades, change management, and release management solutions
- Coach or manage teams as applicable
- Participate in the 24x7 support coverage as needed
- Bachelor’s degree or equivalent experience in an software engineering discipline
- Expertise in at least one technology stack designing, coding, testing, and delivering software
- Proficiency in one or more technology domains, may be a cross-domain expert able to solve complex and mission critical problems within a business or across the firm
- Working knowledge of infrastructure components (e.g. routers, load balancers, cloud products, container systems, compute, storage, and networks)
- Excellent debugging and trouble shooting skills
- 7+ years of Software Engineering experience, using one or more object oriented programming languages and/or scripting.
- 5+ years experience in two or more of the following areas (encourage applying to this role if you have good core experience in at least 2 areas):
- Hand-on experience with cloud-based applications, technologies and tools, deployment, monitoring and operations, such as Kubernetes, Prometheus, FluentD, Slack, Elasticsearch, Grafana, Kibana, etc.
- Relational and NoSQL databases; developing and managing operations leveraging key event streaming, messaging and DB services such as Cassandra, MQ/JMS/Kafka, Aurora, RDS, Cloud SQL, BigTable, DynamoDB, MongoDB, Cloud Spanner, Kinesis, Cloud Pub/Sub, etc.
- Networking (Security, Load Balancing, Network Routing Protocols, etc.)
- Developing monitoring tools and log analysis tools to manage operations
Chase is a leading financial services firm, helping nearly half of America’s households and small businesses achieve their financial goals through a broad range of financial products. Our mission is to create engaged, lifelong relationships and put our customers at the heart of everything we do. We also help small businesses, nonprofits and cities grow, delivering solutions to solve all their financial needs.
We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. In accordance with applicable law, we make reasonable accommodations for applicants’ and employees’ religious practices and beliefs, as well as any mental health or physical disability needs.
Equal Opportunity Employer/Disability/Veterans
About the Team