Impact of NCFET Technology on Eliminating the Cooling Cost and Boosting the Efficiency of Google TPU

Recent breakthroughs in Neural Networks (NNs) have led to significant accuracy improvements. This accuracy improvement comes at the cost of an immense increase in computational demands, and NNs have become one of the most common and computationally intensive workloads in today's datacenters. To address these demands, Google announced the Tensor Processing Unit (TPU) in 2016, a custom ASIC accelerator for NN inference. Two new TPU versions (v2 and v3) followed that also support training. The Google TPUv3 packs immense processing power into a small, dense area, leading to very high on-chip power densities and thus excessive temperatures. In this work, superlattice thermoelectric cooling, one of the emerging on-chip cooling techniques, is considered as an advanced cooling example for the Google TPU, and we investigate the impact of Negative Capacitance FET (NCFET) technology on the cooling and efficiency of the TPU. Our results demonstrate that NCFET can significantly reduce the required cooling cost. We explore the full NCFET design space, including the thickness of the NCFET's ferroelectric layer, the operating voltage, the cooling budget, and the operating frequency, in addition to all possible FinFET configurations. Moreover, our experimental evaluation shows that by eliminating the cooling cost, NCFET delivers 2.8x higher efficiency compared to the conventional FinFET baseline.
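
The design-space exploration described above can be illustrated with a minimal sketch: a joint sweep over ferroelectric-layer thickness, supply voltage, operating frequency, and cooling budget, keeping the most efficient configuration that meets a temperature constraint. All models, parameter values, and constants below are illustrative placeholders (assumptions), not the paper's actual device or thermal models.

```python
import itertools

# Hypothetical design-space sweep (all values and models are placeholders).
TFE_NM    = [0, 1, 2, 3, 4]           # ferroelectric thickness in nm (0 = FinFET baseline)
VDD_V     = [0.45, 0.55, 0.65, 0.75]  # supply voltages
FREQ_GHZ  = [0.7, 0.9, 1.1, 1.3]      # operating frequencies
COOLING_W = [0.0, 5.0, 10.0]          # on-chip cooling power budgets
T_MAX_C   = 85.0                      # assumed junction temperature limit

def power_w(tfe, vdd, freq):
    """Toy power model: dynamic power ~ V^2 * f; a thicker ferroelectric layer
    is assumed to reduce switching energy (placeholder scaling factor)."""
    ncfet_gain = 1.0 / (1.0 + 0.15 * tfe)
    return 200.0 * vdd ** 2 * freq * ncfet_gain

def temperature_c(power, cooling):
    """Toy thermal model: ambient plus a thermal-resistance term, offset by cooling."""
    return 25.0 + 0.4 * power - 1.5 * cooling

def efficiency(freq, power, cooling):
    """Throughput proxy (proportional to frequency) per total watt, cooling included."""
    return (1000.0 * freq) / (power + cooling + 1e-9)

best = None
for tfe, vdd, freq, cooling in itertools.product(TFE_NM, VDD_V, FREQ_GHZ, COOLING_W):
    p = power_w(tfe, vdd, freq)
    if temperature_c(p, cooling) > T_MAX_C:
        continue  # thermally infeasible configuration
    eff = efficiency(freq, p, cooling)
    if best is None or eff > best[0]:
        best = (eff, tfe, vdd, freq, cooling)

if best:
    eff, tfe, vdd, freq, cooling = best
    print(f"best: TFE={tfe} nm, Vdd={vdd} V, f={freq} GHz, "
          f"cooling={cooling} W, efficiency={eff:.1f} (toy units)")
```

Under such a sweep, the efficiency comparison against the FinFET baseline corresponds to evaluating the same objective at a ferroelectric thickness of zero.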