Graphcore

Hardware Reliability Engineer

Reposted 4 Days Ago
Remote or Hybrid
Hiring Remotely in Taipei City
Expert/Leader
The Reliability Engineer at Graphcore is responsible for ensuring the system-level reliability of AI servers, conducting various environmental and mechanical tests, performing failure analysis, and leading design reviews to mitigate risks.
About Graphcore

At Graphcore, we’re building the future of AI compute. We’re a team of semiconductor, software and AI experts, with deep experience in creating the complete AI compute stack - from silicon and software to infrastructure at datacenter scale. As part of the SoftBank Group, backed by significant long-term investment, we are delivering key technology into the fast-growing SoftBank AI ecosystem. To meet the vast and exciting AI opportunity, Graphcore is expanding its teams around the world. We are bringing together the brightest minds to solve the toughest problems, in a place where everyone has the opportunity to make an impact on the company, our products and the future of artificial intelligence.

Job Summary 

Responsible for system-level reliability of AI servers with liquid cooling and HVDC architectures, owning reliability validation, shock & vibration robustness, and failure analysis from board to rack level to ensure safe transport, deployment, and long-term datacenter operation. 

Key Responsibilities and Skills

  • Plan and execute reliability validation across board, server, and rack levels. 
  • Define and run environmental, accelerated, and mechanical tests, including thermal/power cycling, humidity, corrosion, shock & vibration, and HALT/HASS. 
  • Lead shock & vibration validation for transportation, handling, seismic, and operational conditions. 
  • Assess reliability risks for liquid cooling systems (leakage, fatigue, pump life, corrosion, coolant stability). 
  • Evaluate HVDC mechanical and electrical robustness (busbars, connectors, power interfaces). 
  • Perform reliability prediction and life data analysis (Weibull, MTBF). 
  • Lead cross-functional design reviews and drive risk mitigation. 
  • Conduct failure analysis and RCA using standard FA methodologies. 
  • Define and maintain reliability and S&V test specifications (JEDEC, Telcordia GR-63, JESD22, MIL-STD-810, ISTA, ASHRAE, UL, IEC). 
  • Implement Ongoing Reliability Test (ORT) for production quality. 
  • Document results and support customer audits and certifications. 

Qualifications 

  • Bachelor’s or Master’s degree in Mechanical, Electrical, Reliability, Materials, or related Engineering. 
  • 10+ years of reliability engineering experience in AI servers, datacenter systems, HPC, or complex electronics. 
  • Hands-on experience with environmental, shock, and vibration testing. 
  • Strong knowledge of reliability methodologies and statistical analysis. 
  • Practical experience with liquid cooling and HVDC systems. 
  • Proven failure analysis and RCA capability. 
  • Strong communication skills in English; Mandarin a plus. 

 Preferred Experience 

  • AI server architecture and large-scale liquid cooling systems. 
  • FEA/modal analysis and test correlation. 
  • Datacenter, telecom, and transportation standards knowledge. 
  • Reliability certification (e.g., ASQ CRE). 

Benefits
In addition to a competitive salary, Graphcore offers flexible working and a generous annual leave policy. Alongside Taiwan's National Health Insurance scheme, we provide private medical insurance, life and critical illness insurance, and the option to extend healthcare coverage to family members. We also offer an annual health assessment allowance and eye care reimbursement to support your overall wellbeing.

We welcome people of different backgrounds and experiences; we're committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments.
