Product Quality Engineer, AI/ML, Hardware, Google Cloud
Minimum qualifications:
- Bachelor's degree in Electrical Engineering, Computer Engineering, Materials Science, Industrial Engineering, a related technical field, or equivalent practical experience.
- 10 years of experience in Hardware Quality, Reliability, Product Engineering, or a similar role focused on electronic systems (e.g., servers, accelerators, networking equipment).
- 8 years of experience leading cross-functional teams to solve technical problems and drive quality improvements.
- Experience with AI/ML system architectures, including TPU/GPU based platforms, key components (e.g., high-speed interconnects, power delivery), and characteristic failure modes.
Preferred qualifications:
- Master's or PhD degree in Electrical Engineering or a related field.
- Certified Reliability/Quality Engineer (CRE/CQE) certification or equivalent experience.
- 12 years of experience in Quality/Reliability, with substantial direct experience in GPU/TPU or other AI/ML accelerator hardware.
- Experience in a technical leadership role, defining quality strategy, and influencing executive stakeholders.
- Experience in a customer-facing quality role, managing executive communications and escalations for technical issues, with the ability to travel as required.
- Excellent hardware and software debugging skills, with experience in analyzing system logs, manufacturing test data, and diagnostic outputs to pinpoint root causes.
About the job
Google Cloud is powered by advanced compute, network, storage, and Artificial Intelligence (AI) platforms, built on one of the world’s largest and most sophisticated Technical Infrastructures (TI). The Cloud Supply Chain and Operations (CSCO) teams are responsible for the fast and efficient deployment of this infrastructure.
The Global Hardware Quality and Reliability (GHQR) team ensures predictable quality and reliability across all hardware components and systems, including Tensor Processing Unit/Graphics Processing Unit (TPU/GPU) AI platforms and data center infrastructure. This hardware is the backbone of Google Cloud and its AI/ML capabilities, directly contributing to Google's engaged edge.
As a Product Quality Engineer, you will own the quality and reliability strategy for Google's TPU/GPU-based AI/ML platforms. You will be the quality expert, collaborating with cross-functional partners in Design, Manufacturing, and Operations to embed quality into every product. You will also analyze data, drive root cause analysis, and influence process improvements.
Responsibilities
- Define and own the quality and reliability strategy for TPU/GPU hardware across its entire life-cycle, from design through field support.
- Lead the resolution of systemic quality issues in manufacturing and the field, driving Root Cause and Corrective Actions (RCCA) using structured methodologies.
- Collaborate with engineering teams to influence design specifications, qualification plans, and test coverage to ensure product robustness and mitigate early risks.
- Establish and monitor key quality KPIs (e.g., Average Severity Rate (ASR), Average Failure Rate (AFR)). Analyze manufacturing and field data to develop predictive models and drive continuous improvement in design and processes.
- Act as the primary point of contact for customer quality, managing escalations and integrating feedback. Oversee quality and corrective actions with suppliers, including Return Material Authorizations (RMA) and Process Change Notification (PCN) qualification.
Information collected and processed as part of your Google Careers profile, and any job applications you choose to submit, is subject to Google's Applicant and Candidate Privacy Policy.
Google is proud to be an equal opportunity and affirmative action employer. We are committed to building a workforce that is representative of the users we serve, creating a culture of belonging, and providing an equal employment opportunity regardless of race, creed, color, religion, gender, sexual orientation, gender identity/expression, national origin, disability, age, genetic information, veteran status, marital status, pregnancy or related condition (including breastfeeding), expecting or parents-to-be, criminal histories consistent with legal requirements, or any other basis protected by law. See also Google's EEO Policy, Know your rights: workplace discrimination is illegal, Belonging at Google, and How we hire.
If you have a need that requires accommodation, please let us know by completing our Accommodations for Applicants form.
Google is a global company and, in order to facilitate efficient collaboration and communication globally, English proficiency is a requirement for all roles unless stated otherwise in the job posting.
To all recruitment agencies: Google does not accept agency resumes. Please do not forward resumes to our jobs alias, Google employees, or any other organization location. Google is not responsible for any fees related to unsolicited resumes.