An Ultra-High Energy-Efficient Reconfigurable Processor for Deep Neural Networks with Binary/Ternary Weights in 28nm CMOS

An energy-efficient reconfigurable processor for deep neural networks with binary/ternary weights and 1/2/4/8/16-bit activations is implemented in 28nm CMOS technology. Three techniques, Total-Partial-Pixel-Summation (TPPS), Kernel-Transformation-Data-Reconstruction (KTDR), and Hybrid Load-Balancing Mechanism (HLBM), are employed to improve energy efficiency. Measurement results show a peak energy efficiency of 95.8 TOPS/W for BWN, 95.1 TOPS/W for TWN, and 765.6 TOPS/W for BNN, which is 6.6x higher than state-of-the-art works.
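
As general background on why binary/ternary weights are attractive for low-energy hardware, the sketch below shows that a dot product with weights restricted to {-1, 0, +1} needs only additions and subtractions, with zero weights skipped entirely. It illustrates the arithmetic only and is not the processor's datapath or any of the TPPS, KTDR, or HLBM techniques; the function name `ternary_dot` is hypothetical.

```python
def ternary_dot(activations, weights):
    """Multiplication-free dot product for weights in {-1, 0, +1}.

    Binary weights {-1, +1} are the special case with no zeros.
    Illustrative only: an assumption-level sketch, not the paper's design.
    """
    if len(activations) != len(weights):
        raise ValueError("activations and weights must have the same length")
    acc = 0
    for a, w in zip(activations, weights):
        if w == 1:
            acc += a          # +1 weight: add the activation
        elif w == -1:
            acc -= a          # -1 weight: subtract the activation
        elif w != 0:
            raise ValueError("weights must be -1, 0, or +1")
        # w == 0: the term is skipped, so no work is done
    return acc


# Ternary example: 3 - 5 + 0 + 7
ternary_result = ternary_dot([3, 5, 2, 7], [1, -1, 0, 1])
# Binary example: 3 - 5 - 2 + 7
binary_result = ternary_dot([3, 5, 2, 7], [1, -1, -1, 1])
```

In this toy example each output costs at most one add or subtract per nonzero weight, which is the property that lets hardware for such networks replace multipliers with simpler accumulate logic.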