NVIDIA

Director, Software Engineering - DGX Cloud Infrastructure

Sorry, this job was removed at 04:08 a.m. (CST) on Saturday, Jun 28, 2025
In-Office or Remote
5 Locations

NVIDIA is seeking a strategic and technically grounded Director of Engineering to lead a high-impact organization responsible for core compute cloud infrastructure for AI factories. This organization is a key pillar in NVIDIA’s DGX Cloud ecosystem, building shared automation and reliability tooling that supports a sizable portion of our GPU-accelerated compute fleet.

You will further develop and scale an organization of engineers focused on running production software for large-scale GPU-accelerated infrastructure. This organization partners closely with storage, networking, and several other teams across NVIDIA. You will be the engineering leader responsible for interfacing with some of our NVIDIA Cloud Partners to continuously meet our production excellence goals.

What You’ll Be Doing:

  • Build and grow a team of software engineers and leaders focused on automating day 0, day 1, and day 2 operations for large-scale GPU clusters running on bare metal and public clouds with varying service levels.

  • Lead the design and continuous delivery of shared automation frameworks aligned with SLOs and error budgets.

  • Liaise with some of our NVIDIA Cloud Partners to ensure aligned priorities and sustained production excellence.

  • Drive clarity and execution through high ambiguity, translating broad, ever-evolving objectives into iterative delivery milestones.

  • Enable internal teams by reducing operational friction and improving automation coverage across the stack.

What We Need To See:

  • Proven experience leading software engineering teams (including SRE and/or DevOps) responsible for infrastructure automation and distributed systems.

  • Demonstrated ability to build software engineering organizations, drive continuous incremental execution across teams, and operate effectively in highly ambiguous environments with ever-evolving objectives.

  • Hands-on experience designing, running, or automating cloud infrastructure atop bare metal platforms and/or VMs.

  • Experience deploying cloud-native services on public clouds.

  • Track record of representing your company or division in external partnerships with public clouds and infrastructure vendors, and with internal partner teams.

  • Strong foundation in incremental delivery and technical program execution.

  • Excellent written and verbal communication skills, with the ability to influence across levels and disciplines.

  • Bachelor of Science (or equivalent experience) or Master of Science degree in Computer Science or a related field, with 10+ years of overall experience developing and leading cloud infrastructure teams, and 5+ years of management experience.

Ways to stand out from the crowd:

  • Relevant experience developing organizations at public cloud companies. Background leading teams running large-scale GPU clusters. Familiarity with technologies like Linux, NVIDIA BCM, Slurm, InfiniBand, Kubernetes, distributed storage, or BlueField DPUs.

  • Experience developing both internal-facing platform teams and customer-facing infrastructure-as-a-service teams.

  • Track record of collaboration with security or compliance teams, including in regulated environments. Familiarity with AI/ML platform workloads and their reliability or performance characteristics.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. If you're creative, hardworking, and self-motivated, we want to hear from you! NVIDIA leads the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing (HPC), and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. DGX Cloud provides a serverless generative AI infrastructure to the world, enabling NVIDIA’s AI supercomputer technologies to be used by anyone.

The base salary range is 284,000 USD - 425,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
