Yahoo Jobs

Senior Software Engineer - Storage Infrastructure

Yahoo

Reposted Yesterday

Remote

Hiring Remotely in United States of America

128K-267K Annually

Senior level

Design and optimize storage systems at scale, implementing caching strategies and ensuring data availability and performance for a large user base.

The summary above was generated by AI

Yahoo serves as a trusted guide for hundreds of millions of people globally, helping them achieve their goals online through our portfolio of iconic products. For advertisers, Yahoo Advertising offers omnichannel solutions and powerful data to engage with our brands and deliver results.

About the Team

Our platform is the foundational identity and data layer for 900M+ monthly active users, serving 2.5B+ profiles at massive scale. We are building a predictive, identity-centric insights engine—ensuring our audience is understood with precision to deliver hyper-personalized experiences and advertising solutions across all our digital properties.

Our mission centers on first-party data strategy: capturing, enriching, and activating audience signals to build a 360-degree view of every user. We operate under a Privacy-by-Design philosophy, adhering to global regulations (GDPR, CCPA) and industry security standards, while leveraging a cloud-native stack across GCP (BigQuery, Spanner, Dataflow, Composer, GKE) and AWS, with modern MLOps practices to deliver measurable business impact.

About the Role

As a Senior Software Engineer, you will design and optimize the foundational storage layer powering our 2.5B+ profile dataset. Your work on Cloud Spanner schema design, Valkey (Redis-compatible) caching strategies, and multi-region replication ensures sub-10ms data access for APIs serving millions of requests per second, directly enabling hundreds of millions in annual advertising revenue.

You will build and maintain petabyte-scale storage infrastructure with 99.99% availability, implementing efficient read/write patterns, multi-region replication, and disaster recovery mechanisms. Your expertise in distributed databases and caching systems is critical to balancing performance, cost, and reliability at massive scale.

This role demands deep knowledge of Cloud Spanner internals, distributed caching architectures, and production database operations at scale. You will collaborate closely with API, Ingestion, and SRE teams to ensure optimal data access patterns while maintaining data durability and system reliability.

Key Responsibilities

Design and optimize Cloud Spanner schemas for efficient profile storage, query patterns, and write throughput at 2.5B+ profile scale
Implement Valkey (Redis-compatible) caching strategies achieving sub-10ms read latency for hot data access patterns
Build multi-region Spanner replication and automated failover mechanisms ensuring 99.99% availability and disaster recovery
Optimize Spanner read/write throughput, reduce hot-spotting, and improve query performance through index design and query optimization
Implement comprehensive monitoring and alerting systems tracking storage health, latency percentiles (p50, p95, p99), capacity utilization, and cost
Collaborate with API team on efficient data access patterns, query optimization, and caching strategies for activation endpoints
Partner with Ingestion team on high-throughput write patterns, batch loading strategies, and schema evolution without downtime
Design backup, point-in-time recovery, and disaster recovery procedures for critical user profile data
Troubleshoot production storage issues including performance degradation, hot-spotting, lock contention, and capacity constraints
Work with SRE teams on capacity planning, autoscaling strategies, cost optimization, and infrastructure efficiency
Implement cache invalidation strategies, cache warming, and distributed caching patterns for consistent data access
Create comprehensive documentation for storage architecture, operational runbooks, disaster recovery procedures, and on-call playbooks

Required Qualifications

Education

Bachelor's degree in Computer Science, Engineering, or related technical field

Experience

5+ years software engineering experience building production systems
3+ years hands-on experience with distributed databases or large-scale storage systems
2+ years with GCP infrastructure (Spanner, Memorystore, Cloud Monitoring) or AWS equivalents (DynamoDB, ElastiCache)

Technical Skills

Strong proficiency in Java, Go, or Python for infrastructure and database tooling development
Hands-on experience with Cloud Spanner, CockroachDB, TiDB, or other distributed SQL databases
Experience with Redis, Valkey, Memcached, or other distributed caching systems in production
Deep understanding of distributed systems: consistency models (strong vs. eventual), replication strategies, consensus algorithms (Paxos, Raft)
SQL optimization skills and database schema design expertise including indexing strategies, partitioning, and query tuning
Familiarity with database performance tuning: profiling slow queries, analyzing execution plans, optimizing hot-spotting

Competencies

Strong performance tuning and troubleshooting abilities in distributed database environments
Demonstrated ability to deliver reliable infrastructure solutions on schedule with minimal guidance
Excellent collaboration with infrastructure, application, and SRE teams
Team-level impact with ability to influence technical decisions within immediate team
Understanding of data durability, consistency guarantees, and operational excellence

Preferred Qualifications

Experience with multi-region Cloud Spanner deployments at petabyte scale
Knowledge of cache invalidation strategies, cache coherence protocols, and distributed caching patterns
Prior experience in large-scale user data platforms, identity systems, or adtech storage infrastructure
Familiarity with database migration tools (gh-ost, pt-online-schema-change) and zero-downtime schema evolution
Understanding of data partitioning strategies, sharding, horizontal scaling, and distributed transaction processing
Experience with database backup and recovery tools, point-in-time recovery, and disaster recovery testing
Contributions to database or distributed systems open-source projects (Spanner clients, Redis modules, CockroachDB)
Self-driven and detail-oriented, with excellent multitasking abilities in fast-paced environments

The material job duties and responsibilities of this role include those listed above as well as adhering to Yahoo policies; exercising sound judgment; working effectively, safely and inclusively with others; exhibiting trustworthiness and meeting expectations; and safeguarding business operations and brand integrity.

At Yahoo, we offer flexible hybrid work options that our employees love! While most roles don’t require regular office attendance, you may occasionally be asked to attend in-person events or team sessions. You’ll always get notice to make arrangements. Your recruiter will let you know if a specific job requires regular attendance at a Yahoo office or facility. If you have any questions about how this applies to the role, just ask the recruiter!

Yahoo is proud to be an equal opportunity workplace. All qualified applicants will receive consideration for employment without regard to, and will not be discriminated against based on age, race, gender, color, religion, national origin, sexual orientation, gender identity, veteran status, disability or any other protected category. Yahoo will consider for employment qualified applicants with criminal histories in a manner consistent with applicable law. Yahoo is dedicated to providing an accessible environment for all candidates during the application process and for employees during their employment. If you need accessibility assistance and/or a reasonable accommodation due to a disability, please submit a request via the Accommodation Request Form (www.yahooinc.com/careers/contact-us.html) or call +1.866.772.3182. Requests and calls received for non-disability related issues, such as following up on an application, will not receive a response.

We believe that a diverse and inclusive workplace strengthens Yahoo and deepens our relationships. When you support everyone to be their best selves, they spark discovery, innovation and creativity. Among other efforts, our 11 employee resource groups (ERGs) enhance a culture of belonging with programs, events and fellowship that help educate, support and create a workplace where all feel welcome.

The compensation for this position ranges from $128,250.00 - $266,875.00/yr and will vary depending on factors such as your location, skills and experience. The compensation package may also include incentive compensation opportunities in the form of discretionary annual bonus or commissions. Our comprehensive benefits include healthcare, a great 401k, backup childcare, education stipends and much (much) more.

Currently work for Yahoo? Please apply on our internal career site.
