Deterministic Stochastic Computation Using Parallel Datapaths

Although one of the main appeals of Stochastic Computation is that complicated arithmetic operations can be performed with incredibly simple circuitry, it suffers from high latency due to the required bit stream length and from the area needed to generate the bit streams. Using deterministic bit streams addresses the area problem to some degree, but latency remains an issue. One of the main contributors to this high latency is that each operation is performed serially, and we argue that, due to the uniform nature of deterministic stochastic representations, this is not necessary. Using the multiply-accumulate (MAC) operation as the target application, our research addresses this latency issue by exploiting data-level parallelism: the bottleneck of a single arithmetic unit is broken by splitting it into multiple “parallel datapaths”, effectively reducing overall latency by a factor of $ at the cost of area. We demonstrate how adjusting the amount of parallelism provides a means of trading some of these performance gains for reduced area. Furthermore, we show that this design can deliver considerable performance and energy-saving benefits for an operation that has broad applications in Machine Learning: the inner product. Finally, we show that this drastic reduction in computation time through parallelization can pay dividends in energy consumption per operation, with the fully parallelized MAC circuit consuming nearly 4x less energy than its serial counterpart.
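The idea of splitting one serial bit-stream operation across parallel datapaths can be illustrated with a small software model. The sketch below is not the circuit described in this work; it assumes one common deterministic scheme in which a unary stream for one operand is repeated while each bit of the other operand is held, so that every pair of bits meets exactly once at an AND gate. The function names (`unary_bit`, `serial_multiply`, `parallel_multiply`, `mac`) and the `lanes` parameter are hypothetical and chosen only for illustration.

```python
def unary_bit(value, index):
    """Bit `index` of a unary stream with `value` leading ones (encodes value/n)."""
    return 1 if index < value else 0


def serial_multiply(a, b, n):
    """Multiply a/n by b/n with one AND gate over n*n cycles.

    Returns (number of ones in the output stream, cycles used).
    """
    ones = 0
    for t in range(n * n):
        x = unary_bit(a, t % n)   # stream for a, repeated n times
        y = unary_bit(b, t // n)  # each bit of b held for n cycles
        ones += x & y
    return ones, n * n


def parallel_multiply(a, b, n, lanes):
    """Same bit pairings, split evenly across `lanes` modelled datapaths.

    Each lane handles a contiguous slice of the n*n cycles; in this model
    the lanes would run concurrently, so latency is the slice length.
    """
    total = n * n
    assert total % lanes == 0, "lanes must divide n*n in this sketch"
    chunk = total // lanes
    ones = 0
    for lane in range(lanes):
        for t in range(lane * chunk, (lane + 1) * chunk):
            ones += unary_bit(a, t % n) & unary_bit(b, t // n)
    return ones, chunk


def mac(xs, ws, n, lanes):
    """Inner product of two vectors of n-level values, one product at a time."""
    acc, cycles = 0, 0
    for x, w in zip(xs, ws):
        ones, c = parallel_multiply(x, w, n, lanes)
        acc += ones
        cycles += c
    return acc / (n * n), cycles


if __name__ == "__main__":
    print(serial_multiply(3, 5, 8))       # (15, 64): 15/64 == (3/8) * (5/8)
    print(parallel_multiply(3, 5, 8, 4))  # (15, 16): same result, 4x fewer cycles
    print(mac([3, 2], [5, 4], 8, 4))      # (0.359375, 32)
```

In this model the result is identical for any lane count because the set of bit pairings does not change; only the modelled cycle count shrinks, while each added lane would correspond to additional hardware.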
