EnCharge AI

Research Engineer, AI Models

Posted 5 Hours Ago
Remote
Hiring Remotely in US
Senior level
Develop and optimize AI models for efficient inference on custom silicon. Build fine-tuning pipelines, implement acceleration techniques (quantization, sparsity, distillation), create benchmarking and profiling tools, and collaborate with hardware, compiler, and quantization teams to translate algorithmic improvements into real hardware gains.
The summary above was generated by AI

Location: San Francisco, CA (or Remote-friendly with travel)

About EnCharge AI:

EnCharge AI is building the next-generation AI platform. Our novel in-memory-computing architecture delivers a 10x step-function improvement in compute energy efficiency and performance for AI inference workloads. As the demands of artificial intelligence move beyond today's models, we believe the fundamental underlying infrastructure must evolve. We are an experienced team of AI researchers, silicon & systems engineers, and architects backed by leading investors, poised to become the essential platform for the next wave of AI innovation.

The Opportunity:

Modern AI workloads—from large language models to diffusion-based generators to multimodal systems—represent some of the most compute-intensive frontiers in AI, and some of the most promising applications for our hardware’s energy efficiency advantages. We’re building a vertically integrated AI stack that will showcase the transformative potential of our silicon while delivering real value to customers today.

We are seeking a Research Engineer to push the boundaries of AI model quality and efficiency. You’ll build fine-tuning pipelines, develop rigorous benchmarking frameworks, and work at the intersection of ML research and hardware-aware optimization—ensuring our models run beautifully on our silicon.

This is a role for someone who thrives at the boundary between research and engineering. You’ll read papers, implement techniques, and ship production-quality code—all in service of making AI inference faster, cheaper, and better.

Key Responsibilities:

  • Algorithmic Acceleration: Research and implement state-of-the-art techniques to accelerate AI inference—quantization, sparsity, distillation, speculative decoding, caching strategies, and architectural modifications. Systematically characterize tradeoffs between model quality, latency, throughput, and power consumption to find optimal operating points across different use cases.
  • Hardware Co-Design: Partner closely with hardware, compiler, and quantization teams to ensure algorithmic improvements translate to real gains on our silicon. Identify optimizations aligned with our architecture's strengths—maximizing throughput while minimizing power. Shape the feedback loop between model development and hardware roadmap.
  • Evaluation: Build profiling tools and comprehensive benchmarking frameworks to understand compute bottlenecks, measure model quality across standard and domain-specific evals, and track efficiency metrics. Establish the methodology that informs both algorithmic choices and hardware-software co-design.
  • Applied Research: Build robust fine-tuning workflows for modern AI models, enabling rapid experimentation with LoRA, adapters, and full fine-tuning. Stay current with the rapidly evolving landscape—evaluate new architectures, implement promising techniques, and contribute insights that inform technical and go-to-market strategy.

Qualifications:

  • 5+ years of experience in ML research, applied ML, or ML systems
  • Strong fundamentals in Python and PyTorch
  • Hands-on experience with modern AI models (transformers, diffusion models, or other generative architectures)
  • Experience fine-tuning large models and building training/evaluation pipelines
  • Deep understanding of transformers, attention mechanisms, and optimization techniques
  • Comfort reading and implementing techniques from research papers

Nice to Have:

  • Experience with efficient inference techniques (KV cache optimization, attention variants, MoE routing, flow matching)
  • Background in hardware-aware ML optimization or quantization
  • Familiarity with profiling tools (PyTorch Profiler, Nsight, custom instrumentation)
  • Publications in generative modeling, efficient inference, or ML systems
  • Contributions to open-source ML projects
