Photon

Data Lead - Dallas, TX

Posted 2 Days Ago
In-Office or Remote
Hiring Remotely in United States
46K-162K Annually
Expert/Leader
Design and scale end-to-end data pipelines and RAG architectures for LLM consumption: ETL/ELT, vector databases, chunking/embeddings, data cleaning/PII removal, metadata engineering, low-latency streaming, and evaluation/versioning to support agentic AI.
The summary above was generated by AI

We are seeking a Lead Data Engineer to build and scale the data infrastructure powering our Agentic AI products. You will be responsible for the "Ingestion-to-Insight" pipeline that allows autonomous agents to access, search, and reason over vast amounts of proprietary and public data.

Your role is critical: you will design the RAG (Retrieval-Augmented Generation) architectures and data pipelines that ensure our agents have the right context at the right time to make accurate decisions.

Key Responsibilities

  • AI-Ready Data Pipelines: Design and implement scalable ETL/ELT pipelines that process both structured (SQL, logs) and unstructured (PDFs, emails, docs) data specifically for LLM consumption.
  • Vector Database Management: Architect and optimize Vector Databases (e.g., Pinecone, Weaviate, Milvus, or Qdrant) to ensure high-speed, relevant similarity searches for agentic retrieval.
  • Chunking & Embedding Strategies: Collaborate with AI Engineers to optimize data chunking strategies and embedding models to improve the "recall" and "precision" of the agent's knowledge retrieval.
  • Data Quality for AI: Develop automated "Data Cleaning" workflows to remove noise, PII (Personally Identifiable Information), and toxicity from training/context datasets.
  • Metadata Engineering: Enrich raw data with advanced metadata tagging to help agents filter and prioritize information during multi-step reasoning tasks.
  • Real-time Data Streaming: Build low-latency data streams (using Kafka or Flink) to provide agents with "fresh" data, enabling them to act on real-time market or operational changes.
  • Evaluation Frameworks: Construct "Gold Datasets" and versioned data snapshots to help the team benchmark agent performance over time.

Required Skills & Qualifications

  • Experience: 10+ years in Data Engineering, with at least 2 years focusing on data for LLMs or AI/ML applications.
  • Python Mastery: Deep expertise in Python (Pandas, Pydantic, FastAPI) for data manipulation and API integration.
  • Data Tooling: Strong experience with modern data stack tools (e.g., dbt, Airflow, Dagster, Snowflake, or Databricks).
  • Vector Expertise: Hands-on experience with at least one major Vector Database and knowledge of similarity search indexes and metrics (e.g., HNSW, cosine similarity).
  • Search Knowledge: Familiarity with hybrid search techniques (combining semantic search with traditional keyword search like Elasticsearch/BM25).
  • Cloud Infrastructure: Proficiency in managing data workloads on AWS, Azure, or GCP.

Preferred Qualifications

  • Experience with LlamaIndex or LangChain for data ingestion.
  • Knowledge of Graph Databases (e.g., Neo4j) to help agents understand complex relationships between data points.
  • Familiarity with "Data-Centric AI" principles—prioritizing data quality over model size.

Compensation, Benefits and Duration

Minimum Compensation: USD 46,000
Maximum Compensation: USD 162,000
Compensation is based on the candidate's actual experience and qualifications. The above is a reasonable, good-faith estimate for the role.
Medical, vision, and dental benefits, a 401(k) retirement plan, variable pay/incentives, paid time off, and paid holidays are available for full-time employees.
This position is not available for independent contractors.
No applications will be considered if received more than 120 days after the date of this post.

Photon Chicago, Illinois, USA Office

11 East Adams Street Suite 1100, Chicago, IL, United States, 60603
