uGEMM: Unary Computing for GEMM Applications

General matrix multiplication (GEMM) is pervasive in domains such as signal processing, computer vision, and machine learning. Conventional binary architectures for GEMM scale poorly in area and energy efficiency because they represent and compute on numbers spatially. In contrast, unary computing processes data in the temporal domain with extremely simple logic. To date, however, few efficient architectures for unary GEMM exist. In this work, we first present uGEMM, a hardware-efficient unary GEMM architecture enabled by universally compatible arithmetic units, which simultaneously achieves input insensitivity and high output accuracy. Next, we demonstrate that uGEMM can reliably terminate computation early and offers dynamic energy-accuracy scaling for real-world applications via an accuracy-aware metric. Finally, to propel future research in unary computing, we open-source our unary computing simulator, UnarySim.
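
To make "processing data in the temporal domain with extremely simple logic" concrete, the following minimal Python sketch multiplies two values in [0, 1] by streaming bits through a single AND gate and counting ones. It is a generic illustration of unary multiplication, not the uGEMM arithmetic units or the UnarySim API. The function names (`bit_reverse`, `temporal_coded`, `rate_coded`, `unary_multiply`), the 8-bit stream width, and the choice of a bit-reversed counter as the rate-coding sequence are all assumptions made for this example.

```python
def bit_reverse(i, width):
    """Reverse the low `width` bits of i (hypothetical helper).

    A bit-reversed counter spreads values evenly over time, which keeps
    the ones of a rate-coded stream roughly uniformly distributed.
    """
    out = 0
    for _ in range(width):
        out = (out << 1) | (i & 1)
        i >>= 1
    return out


def temporal_coded(value, width):
    """Unary stream with all ones first: 1 while the cycle index < threshold."""
    n = 1 << width
    threshold = round(value * n)
    return [1 if t < threshold else 0 for t in range(n)]


def rate_coded(value, width):
    """Unary stream with ones spread out: compare a bit-reversed counter
    against the threshold in every cycle."""
    n = 1 << width
    threshold = round(value * n)
    return [1 if bit_reverse(t, width) < threshold else 0 for t in range(n)]


def unary_multiply(a, b, width=8):
    """Multiply a and b (both in [0, 1]) with one AND gate per cycle.

    The product is the fraction of cycles in which both streams are 1.
    """
    stream_a = temporal_coded(a, width)
    stream_b = rate_coded(b, width)
    ones = sum(x & y for x, y in zip(stream_a, stream_b))
    return ones / (1 << width)


print(unary_multiply(0.5, 0.5))    # 0.25
print(unary_multiply(0.75, 0.5))   # 0.375
```

In this sketch the stream length grows as 2^width, so precision is paid for in cycles rather than in gates; that is the temporal trade-off the abstract contrasts with spatial binary arithmetic.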
