WHO WE ARE
Zeta Global (NYSE: ZETA) is the AI-Powered Marketing Cloud that leverages advanced artificial intelligence (AI) and trillions of consumer signals to help marketers acquire, grow, and retain customers more efficiently. Through the Zeta Marketing Platform (ZMP), our vision is to make sophisticated marketing simple by unifying identity, intelligence, and omnichannel activation into a single platform – powered by one of the industry’s largest proprietary databases and AI. Our enterprise customers across multiple verticals are empowered to personalize experiences with consumers at an individual level across every channel, delivering better results for marketing programs. Zeta was founded in 2007 by David A. Steinberg and John Sculley and is headquartered in New York City with offices around the world. To learn more, go to www.zetaglobal.com.
The Opportunity
We are looking for a Staff Data Engineer to lead the design and implementation of a unified semantic data layer that spans all of Zeta’s data sources—both data at rest and data in motion. This role sits at the intersection of data engineering, platform architecture, and AI enablement. You will be responsible for building a middleware semantic layer (using Cube Core or similar technologies) that exposes clean, governed, multi-tenant data via standardized APIs and tool interfaces, enabling AI agents and LLMs to query, reason over, and act on Zeta’s data with high performance, security, and compliance.
This is a high-impact, high-visibility role that will shape how Zeta’s AI systems consume and interact with data across the organization.
What You’ll Do
Semantic Layer Architecture & Development
- Design and build a centralized semantic data layer using Cube Core (or equivalent technology such as Headless BI, dbt Metrics Layer, or Metriql) that provides a unified, governed abstraction over all company data sources.
- Define semantic models, metrics, dimensions, and relationships that map to business domains across marketing, advertising, identity resolution, and customer analytics.
- Expose the semantic layer via REST/GraphQL APIs and MCP-compatible tool interfaces purpose-built for consumption by AI agents and LLMs.
Data Source Integration & Unification
- Integrate and unify data from heterogeneous systems including MySQL, DynamoDB, Aerospike, Snowflake, Amazon S3 (data lakes), Apache Kafka, Amazon SQS, and other internal data stores.
- Build connectors, adapters, and federation layers to query across both operational (OLTP) and analytical (OLAP) data sources in a performant, cost-efficient manner.
- Ensure seamless handling of both data at rest (warehouses, lakes, databases) and data in motion (streaming platforms, event buses, message queues).
AI & LLM Enablement
- Design tool interfaces and API contracts that allow AI agents to discover available data, understand schema semantics, and generate accurate queries autonomously.
- Collaborate with AI/ML teams to optimize the semantic layer for LLM-generated SQL, natural language querying, retrieval-augmented generation (RAG), and agentic workflows.
- Implement guardrails, query validation, and cost controls to prevent runaway queries from AI-generated workloads.
Multi-Tenancy, Security & Compliance
- Architect the semantic layer with native multi-tenant isolation, ensuring strict data segregation and tenant-scoped access controls.
- Implement row-level security, column-level masking, and attribute-based access controls (ABAC) to enforce data governance policies.
- Ensure compliance with SOC 2, GDPR, CCPA, and industry-specific regulations governing data access, PII handling, and cross-border data flows.
Performance, Scalability & Reliability
- Design for horizontal scalability to support thousands of concurrent queries from AI agents, internal dashboards, and customer-facing products.
- Implement intelligent caching (pre-aggregation, materialized views, query result caching) to deliver sub-second response times for common query patterns.
- Build observability into the semantic layer with comprehensive metrics, logging, alerting, and query performance profiling.
Technical Leadership & Collaboration
- Serve as the technical authority on data architecture decisions, authoring ADRs (Architecture Decision Records) and reference architectures.
- Mentor and guide senior engineers on best practices for semantic modeling, data governance, and API design.
- Partner cross-functionally with Product, Data Science, Platform Engineering, InfoSec, and Compliance teams to align the data layer with business objectives.
What We’re Looking For
Required Qualifications
- 10+ years of experience in data engineering, data architecture, or platform engineering, with at least 3 years operating at a Staff/Principal level.
- Deep hands-on expertise with multiple data stores: relational (MySQL/PostgreSQL), NoSQL (DynamoDB, Aerospike, MongoDB), cloud data warehouses (Snowflake, BigQuery, Redshift), and data lakes (S3, Delta Lake, Iceberg).
- Strong experience with streaming and messaging systems: Apache Kafka, Amazon SQS/SNS, Kinesis, or equivalent.
- Proven experience building or operating semantic/metrics layers using Cube.js/Cube Core, dbt Metrics, LookML, or similar technologies.
- Expert-level SQL skills and experience with query optimization across distributed systems.
- Production experience designing multi-tenant data platforms with strict security and isolation requirements.
- Strong understanding of data governance, access control models (RBAC, ABAC), and compliance frameworks (SOC 2, GDPR, CCPA).
- Experience designing and exposing APIs (REST, GraphQL) for data consumption at scale.
- BS/MS in Computer Science, Data Engineering, or equivalent practical experience.
Preferred Qualifications
- Experience building data interfaces specifically for AI/ML consumption, including tool-use APIs for LLM agents, MCP (Model Context Protocol), or function-calling patterns.
- Familiarity with AI orchestration frameworks (LangChain, LlamaIndex, Semantic Kernel) and how they interact with external data tools.
- Experience with infrastructure-as-code (Terraform, Pulumi), container orchestration (Kubernetes, ECS), and CI/CD pipelines for data platform deployments.
- Background in MarTech/AdTech data domains: identity graphs, audience segmentation, campaign analytics, attribution modeling, or real-time bidding data.
- Contributions to open-source data tools or published thought leadership on semantic layers, data mesh, or AI-enabled data architectures.
BENEFITS & PERKS
- Unlimited PTO
- Excellent medical, dental, and vision coverage
- Employee Equity
- Employee Discounts, Virtual Wellness Classes, Pet Insurance, and more!
SALARY RANGE
The salary range for this role is $170,000 - $200,000, depending on location and experience.
PEOPLE & CULTURE AT ZETA
Zeta considers applicants for employment without regard to, and does not discriminate on the basis of, an individual’s sex, race, color, religion, age, disability, status as a veteran, or national or ethnic origin; nor does Zeta discriminate on the basis of sexual orientation, gender identity, or gender expression.
We’re committed to building a workplace culture of trust and belonging, so everyone feels invited to bring their whole selves to work. We provide a forum for employees to celebrate, support and advocate for one another. Learn more about our commitment to diversity, equity and inclusion here: https://zetaglobal.com/blog/a-look-into-zetas-ergs/
ZETA IN THE NEWS!
https://zetaglobal.com/press/?cat=press-releases

