Systematic co-optimization from chip design and process technology to systems for GPU AI chips

In this paper we present a systematic approach to identify, predict, and optimize design-process interaction (DPI), and to co-optimize holistically from technology to chip and system. This approach has enabled the best performance, power, and yield for GPUs/SoCs in high-performance computing (HPC), artificial intelligence (AI), and autonomous vehicle applications. GPU HPC performance has improved 100× over the past 10 years, outpacing Moore's Law. AI deep learning performance has improved even faster, by ∼100× in 5 years, enabled by co-optimization of architecture, circuit, and process technology. The era of defects per trillion (DPPT) levels of perfection has also arrived: DPPT criteria for defect and variability outlier control are required to meet yield and reliability targets for today's complex GPUs, which contain hundreds of billions of layout pieces. Intelligent test chip designs are employed to identify issues and margins, and to help predict large-chip and high-volume issues through smart sample learning. TCAD defect/reliability margin modeling, design for manufacturing (DFM), and design for reliability (DFR) are also used to optimize design and process, maximizing PPAYRT (power, performance, area, yield, reliability, time to market).
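
To make the scale of these claims concrete, the following is a minimal back-of-the-envelope sketch, not the authors' model. It assumes that layout pieces fail independently with the same probability (a simple yield model chosen here for illustration), takes a hypothetical count of 100 billion layout pieces as a stand-in for "hundreds of billions", and assumes a two-year doubling period for Moore's Law. The function names `annual_growth` and `chip_yield` are illustrative only.

```python
import math


def annual_growth(total_factor, years):
    """Compound annual improvement factor implied by a total gain over `years`."""
    return total_factor ** (1.0 / years)


def chip_yield(failure_prob_per_piece, num_pieces):
    """Yield if every layout piece fails independently with the same probability.

    Assumption (not from the paper): yield = (1 - p)^N.
    log1p keeps the computation accurate when p is extremely small.
    """
    return math.exp(num_pieces * math.log1p(-failure_prob_per_piece))


# Growth rates quoted in the abstract, converted to per-year factors.
gpu_hpc_rate = annual_growth(100, 10)  # 100x over 10 years
ai_rate = annual_growth(100, 5)        # ~100x over 5 years

# Assumption: Moore's Law taken as a doubling every 2 years.
moore_10yr = 2 ** (10 / 2)

print(f"GPU HPC: {gpu_hpc_rate:.2f}x per year")
print(f"AI deep learning: {ai_rate:.2f}x per year")
print(f"Moore's Law over 10 years (2-year doubling): {moore_10yr:.0f}x")

# Hypothetical piece count standing in for "hundreds of billions".
NUM_PIECES = 100e9

for label, p in [
    ("1 failure per million pieces", 1e-6),
    ("1 failure per billion pieces", 1e-9),
    ("1 failure per trillion pieces", 1e-12),
]:
    print(f"{label}: yield = {chip_yield(p, NUM_PIECES):.3g}")
```

Under these assumptions, a per-piece failure rate of one per billion leaves essentially zero yield on a chip with 100 billion pieces, while one per trillion gives a yield of about exp(-0.1), roughly 90%. This illustrates why per-trillion criteria become necessary at this level of complexity, although real yield models also account for defect clustering and redundancy.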