Computing approximately, and efficiently

Recent years have witnessed significant interest in the area of approximate computing. Much of this interest stems from the quest for new sources of computing efficiency in the face of diminishing benefits from technology scaling. We argue that trends in computing workloads will greatly increase the opportunities for approximate computing, describe the vision and key principles that have guided our work in this area, and outline a range of approximate computing techniques that we have developed at all layers of the computing stack, spanning circuits, architecture, and software.

[1]  S. Hamid Nawab,et al.  Probabilistic complexity analysis for a class of approximate DFT algorithms , 1996, J. VLSI Signal Process..

[2]  Jie Han,et al.  Approximate computing: An emerging paradigm for energy-efficient design , 2013, 2013 18th IEEE European Test Symposium (ETS).

[3]  Pradeep Dubey,et al.  Convergence of Recognition, Mining, and Synthesis Workloads and Its Implications , 2008, Proceedings of the IEEE.

[4]  Kaushik Roy,et al.  ASLAN: Synthesis of approximate sequential circuits , 2014, 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[5]  Anand Raghunathan,et al.  Best-effort parallel execution framework for Recognition and mining applications , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.

[6]  Kaushik Roy,et al.  AxNN: Energy-efficient neuromorphic systems using approximate computing , 2014, 2014 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED).

[7]  Kaushik Roy,et al.  IMPACT: IMPrecise adders for low-power approximate computing , 2011, IEEE/ACM International Symposium on Low Power Electronics and Design.

[8]  Vijay V. Vazirani,et al.  Approximation Algorithms , 2001, Springer Berlin Heidelberg.

[9]  Surendra Byna,et al.  Exploiting the forgiving nature of applications for scalable parallel execution , 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).

[10]  Kaushik Roy,et al.  MACACO: Modeling and analysis of circuits for approximate computing , 2011, 2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).

[11]  Chaitali Chakrabarti,et al.  Design methodology to trade off power, output quality and error resiliency: application to color interpolation filtering , 2007, ICCAD 2007.

[12]  Kaushik Roy,et al.  SALSA: Systematic logic synthesis of approximate circuits , 2012, DAC Design Automation Conference 2012.

[13]  Anand Raghunathan,et al.  Best-effort computing: Re-thinking parallel software and hardware , 2010, Design Automation Conference.

[14]  Vijay K. Madisetti,et al.  The Digital Signal Processing Handbook , 1997 .

[15]  Douglas L. Jones,et al.  Scalable stochastic processors , 2010, 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010).

[16]  Karthikeyan Sankaralingam,et al.  Dark Silicon and the End of Multicore Scaling , 2012, IEEE Micro.

[17]  Hugh Garraway Parallel Computer Architecture: A Hardware/Software Approach , 1999, IEEE Concurrency.

[18]  Anand Raghunathan,et al.  Relax-and-Retime: A methodology for energy-efficient recovery based design , 2013, 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC).

[19]  Karthikeyan Sankaralingam,et al.  Dark silicon and the end of multicore scaling , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).

[20]  Robert H. Dennard,et al.  A 30 Year Retrospective on Dennard's MOSFET Scaling Paper , 2007 .

[21]  R.H. Dennard,et al.  Design Of Ion-implanted MOSFET's with Very Small Physical Dimensions , 1974, Proceedings of the IEEE.

[22]  Christoforos E. Kozyrakis Advancing computer systems without technology progress , 2013, 2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).

[23]  Anand Raghunathan,et al.  PIC: Partitioned Iterative Convergence for Clusters , 2012, 2012 IEEE International Conference on Cluster Computing.

[24]  Harald Niederreiter,et al.  Probability and computing: randomized algorithms and probabilistic analysis , 2006, Math. Comput..

[25]  K. Steinhubl Design of Ion-Implanted MOSFET'S with Very Small Physical Dimensions , 1974 .

[26]  Kaushik Roy,et al.  Process Variation Tolerant Low Power DCT Architecture , 2007, 2007 Design, Automation & Test in Europe Conference & Exhibition.

[27]  Lingamneni Avinash,et al.  Ten Years of Building Broken Chips: The Physics and Engineering of Inexact Computing , 2013, TECS.

[28]  Surendra Byna,et al.  Best-effort semantic document search on GPUs , 2010, GPGPU-3.

[29]  Kaushik Roy,et al.  Analysis and characterization of inherent application resilience for approximate computing , 2013, 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC).

[30]  Kaushik Roy,et al.  A process variation aware low power synthesis methodology for fixed-point FIR filters , 2007, Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07).

[31]  Kaushik Roy,et al.  StoRM: A Stochastic Recognition and Mining processor , 2014, 2014 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED).

[32]  Bruce S. Davie,et al.  Computer Networks, Fifth Edition: A Systems Approach , 2017 .

[33]  Jürgen Schmidhuber,et al.  Deep learning in neural networks: An overview , 2014, Neural Networks.

[34]  Kaushik Roy,et al.  Approximate computing: An integrated hardware approach , 2013, 2013 Asilomar Conference on Signals, Systems and Computers.

[35]  Naresh R. Shanbhag,et al.  Energy-efficient signal processing via algorithmic noise-tolerance , 1999, Proceedings. 1999 International Symposium on Low Power Electronics and Design (Cat. No.99TH8477).

[36]  Kaushik Roy,et al.  Significance driven computation: a voltage-scalable, variation-aware, quality-tuning motion estimator , 2009, ISLPED.

[37]  Henry Hoffmann,et al.  Managing performance vs. accuracy trade-offs with loop perforation , 2011, ESEC/FSE '11.

[38]  Werner Vogels,et al.  Eventually consistent , 2008, CACM.

[39]  Luis Ceze,et al.  Architecture support for disciplined approximate programming , 2012, ASPLOS XVII.

[40]  Kaushik Roy,et al.  Design of voltage-scalable meta-functions for approximate computing , 2011, 2011 Design, Automation & Test in Europe.

[41]  Kaushik Roy,et al.  Scalable Effort Hardware Design , 2014, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[42]  B. Liu,et al.  Effect of finite word length on the accuracy of digital filters--a review , 1971 .

[43]  Kaushik Roy,et al.  Substitute-and-simplify: A unified design paradigm for approximate and quality configurable circuits , 2013, 2013 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[44]  Kaushik Roy,et al.  Quality programmable vector processors for approximate computing , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[45]  Riccardo Bettati,et al.  Imprecise computations , 1994, Proc. IEEE.

[46]  Kaushik Roy,et al.  Low complexity digital signal processing system design techniques , 2005 .

[47]  Kaushik Roy,et al.  Scalable effort hardware design: Exploiting algorithmic resilience for energy efficiency , 2010, Design Automation Conference.

[48]  G. Amdhal,et al.  Validity of the single processor approach to achieving large scale computing capabilities , 1967, AFIPS '67 (Spring).

[49]  Kaushik Roy,et al.  Dynamic effort scaling: Managing the quality-efficiency tradeoff , 2011, 2011 48th ACM/EDAC/IEEE Design Automation Conference (DAC).

[50]  Kaushik Roy,et al.  An Optimal Algorithm for Low Power Multiplierless FIR Filter Design using Chebychev Criterion , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[51]  Anantha Chandrakasan,et al.  Approximate Signal Processing , 1997, J. VLSI Signal Process..