Unknown Company
SOC Quality and Reliability Engineer, Google Cloud
Note: By applying to this position you will have an opportunity to share your preferred working location from the following: Tel Aviv, Israel; Haifa, Israel.
Google’s data centers are the most advanced in the world. In this role, you will help build the state-of-the-art SoC’s that power these data centers by driving quality and reliability processes from the Integrated Circuit perspective. You will have an opportunity to create silicon and follow it into the field and back to drive improvements for the next-generations of chips.
You will have an understanding of Integrated Circuit (IC) flows, wafer processing, testing, qualification, yield, reliability, and failure analysis is expected. You will work with various cross-functional teams to develop quality and reliability specifications, develop and deploy design guidelines, and develop and execute and test plans. You will collaborate with global hardware quality and reliability teams, silicon design, validation and engineering teams.The AI and Infrastructure team is redefining what’s possible. We empower Google customers with breakthrough capabilities and insights by delivering AI and Infrastructure at unparalleled scale, efficiency, reliability and velocity. Our customers include Googlers, Google Cloud customers, and billions of Google users worldwide.
We're behind Google's groundbreaking innovations, empowering the development of AI models, delivering unparalleled computing power to global services, and providing the essential platforms that enable developers to build the future. From software to hardware our teams are shaping the future of world-leading hyperscale computing, with key teams working on the development of our TPUs, Vertex AI for Google Cloud, Google Global Networking, Data Center operations, systems research, and much more.Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. See also Google's EEO Policy and EEO is the Law. If you have a disability or special need that requires accommodation, please let us know by completing our Accommodations for Applicants form.
Minimum qualifications:
- Bachelor's degree in Electrical Engineering, Materials Science, Physics, or a related field or equivalent practical experience.
- 4 years of experience in Integrated Circuit (IC) silicon quality or reliability.
- Experience leading the product reliability life-cycle from post-tapeout through high-volume manufacturing.
- Experience with semiconductor complementary metal-oxide-semiconductor (CMOS) technology, device physics, and failure mechanisms.
Preferred qualifications:
- Master's degree in Electrical Engineering, Materials Science, or related field.
- Expertise in statistical data analysis using tools such as JMP, Python, or JMP Scripting Language (JSL).
- Familiarity with electrical failure analysis (EFA) and physical failure analysis (PFA) techniques.
- Knowledge of design-for-reliability (DfR) rules and implementation techniques.
- Track record with silicon reliability on process nodes and advanced packaging technologies.
About the job
In this role, you’ll work to shape the future of AI/ML hardware acceleration. You will have an opportunity to drive TPU (Tensor Processing Unit) technology that powers Google's most demanding AI/ML applications. You’ll be part of a team that pushes boundaries, developing custom silicon solutions that power the future of Google's TPU. You'll contribute to the innovation behind products loved by millions worldwide, and leverage your design and verification expertise to verify complex digital designs, with a specific focus on TPU architecture and its integration within AI/ML-driven systems.Google’s data centers are the most advanced in the world. In this role, you will help build the state-of-the-art SoC’s that power these data centers by driving quality and reliability processes from the Integrated Circuit perspective. You will have an opportunity to create silicon and follow it into the field and back to drive improvements for the next-generations of chips.
You will have an understanding of Integrated Circuit (IC) flows, wafer processing, testing, qualification, yield, reliability, and failure analysis is expected. You will work with various cross-functional teams to develop quality and reliability specifications, develop and deploy design guidelines, and develop and execute and test plans. You will collaborate with global hardware quality and reliability teams, silicon design, validation and engineering teams.The AI and Infrastructure team is redefining what’s possible. We empower Google customers with breakthrough capabilities and insights by delivering AI and Infrastructure at unparalleled scale, efficiency, reliability and velocity. Our customers include Googlers, Google Cloud customers, and billions of Google users worldwide.
We're behind Google's groundbreaking innovations, empowering the development of AI models, delivering unparalleled computing power to global services, and providing the essential platforms that enable developers to build the future. From software to hardware our teams are shaping the future of world-leading hyperscale computing, with key teams working on the development of our TPUs, Vertex AI for Google Cloud, Google Global Networking, Data Center operations, systems research, and much more.
Responsibilities
- Lead the strategic definition and development of Design-for-Reliability (DfR) guidelines, collaborating with cross-functional subject matter experts to integrate reliability into early design stages.
- Establish and direct the development of qualification hardware and test methodologies, managing internal teams and external vendors to ensure silicon and package verification.
- Execute comprehensive silicon and package qualification programs (including high-temperature operating life (HTOL), early life failure rate (ELFR), electrostatic discharge and latch-up (ESD/LU), and biased highly accelerated stress test (b/HAST)) and conduct in-depth failure analysis to resolve quality issues.
- Analyze data from qualification programs, high-volume manufacturing, and field returns to identify failure mechanisms and trends for yield and reliability optimization.
- Develop and implement physics-based statistical quality and reliability models (e.g., early life failure (ELF), time-dependent dielectric breakdown (TDDB), or negative bias temperature instability (NBTI)) to predict device failure mechanisms and lifetime behaviors.
Additional Information
- Published
- 2026-06-23T07:39:49.199Z
- Url
- https://careers.google.com/jobs/results/98433050534126278-soc-quality-and-reliability-engineer/
- Jobtype
- FULL_TIME
- Employer
- Languagecode
- en-US
- Remote
- onsite
- Isremote
- No
- Ishybrid
- No
- City
- Tel AvivHaifa
- Country
- IsraelIsrael