ApproxEigen: An approximate computing technique for large-scale eigen-decomposition

Recognition, Mining, and Synthesis (RMS) applications are expected to make up much of future computing workloads. Many of these applications (e.g., recommender systems and search engines) are formulated as finding eigenvalues and eigenvectors of large-scale matrices. These applications are inherently error-tolerant, and computing all the eigenpairs is often unnecessary and sometimes even impossible. Motivated by these observations, we propose ApproxEigen, a novel approximate computing technique for large-scale eigen-decomposition that targets the Krylov subspace methods used in practice to find a finite number of eigenpairs. ApproxEigen provides a set of computation kernels with different levels of approximation for data pre-processing and solution finding, and conducts accuracy tuning under given quality constraints. Experimental results demonstrate that ApproxEigen achieves significant energy-efficiency improvement while maintaining high accuracy.
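To make the idea concrete, the following is a minimal sketch, not the ApproxEigen implementation. It assumes a symmetric matrix, uses the Lanczos iteration as one representative Krylov subspace method, and treats the floating-point precision of the matrix-vector product as a stand-in for an "approximation level". The function names (`lanczos_top_eigs`, `relative_residual`, `tune_precision`), the precision ladder, and the residual-based quality constraint are all illustrative assumptions and do not come from the paper.

```python
import numpy as np


def lanczos_top_eigs(A, k, m=40, dtype=np.float64, seed=0):
    """Approximate the k largest eigenpairs of a symmetric matrix A.

    Runs at most m Lanczos steps with full reorthogonalization. Only the
    matrix-vector product is carried out in `dtype`; this is an assumed
    stand-in for an approximate kernel, not the paper's hardware kernels.
    """
    n = A.shape[0]
    m = min(m, n)
    A_lp = A.astype(dtype)                 # matrix stored at reduced precision
    rng = np.random.default_rng(seed)
    Q = np.zeros((n, m))                   # orthonormal Krylov basis
    T = np.zeros((m, m))                   # tridiagonal projection of A
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    steps = m
    for j in range(m):
        Q[:, j] = q
        # Approximate kernel: low-precision matvec, result widened to float64.
        w = (A_lp @ q.astype(dtype)).astype(np.float64)
        T[j, j] = q @ w
        # Full reorthogonalization against every basis vector built so far.
        w -= Q[:, : j + 1] @ (Q[:, : j + 1].T @ w)
        beta = np.linalg.norm(w)
        if j + 1 == m or beta < 1e-12:
            steps = j + 1
            break
        T[j, j + 1] = T[j + 1, j] = beta
        q = w / beta
    theta, S = np.linalg.eigh(T[:steps, :steps])
    idx = np.argsort(theta)[::-1][:k]      # keep the k largest Ritz values
    return theta[idx], Q[:, :steps] @ S[:, idx]


def relative_residual(A, theta, V):
    """Worst-case ||A v - theta v|| / (|theta| ||v||), evaluated in float64."""
    R = A @ V - V * theta
    scale = np.maximum(np.abs(theta), 1e-300) * np.linalg.norm(V, axis=0)
    return float(np.max(np.linalg.norm(R, axis=0) / scale))


def tune_precision(A, k, tol, levels=(np.float16, np.float32, np.float64)):
    """Return the cheapest level whose eigenpairs satisfy the residual bound.

    If no level qualifies, the result of the most precise level is returned.
    """
    for dtype in levels:
        theta, V = lanczos_top_eigs(A, k, dtype=dtype)
        res = relative_residual(A, theta, V)
        if res <= tol:
            break
    return np.dtype(dtype).name, theta, V, res


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = 120
    U, _ = np.linalg.qr(rng.standard_normal((n, n)))
    spectrum = np.concatenate([np.linspace(0.0, 1.0, n - 3), [5.0, 8.0, 10.0]])
    A = (U * spectrum) @ U.T
    level, theta, V, res = tune_precision(A, k=3, tol=1e-4)
    print(level, theta, res)
```

The tuning loop tries the cheapest level first and escalates only when the quality constraint is violated. That ordering is one plausible reading of "accuracy tuning under given quality constraints". A real system would also need an energy model to judge whether a lower level is actually cheaper overall.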
