Paper information - RMAC: Runtime Configurable Floating Point Multiplier for Approximate Computing

Approximate computing is a way to build fast and energy-efficient systems that provide responses of good-enough quality tailored to different purposes. In this paper, we propose RMAC, a novel approximate floating-point multiplier that efficiently multiplies two floating-point numbers and yields a high-precision product. RMAC approximates the costly mantissa multiplication with a simple addition of the input operands' mantissas. To tune the level of accuracy, RMAC examines the first bit of each input mantissa as well as the first N bits of the addition result to dynamically estimate the maximum multiplication error rate. RMAC then decides either to accept the approximate result or to re-execute the multiplication exactly. Depending on the value of N, RMAC can be configured to achieve different levels of accuracy. We integrate RMAC into an AMD Southern Islands GPU by replacing the existing floating-point units with RMAC. We test the efficiency and accuracy of the enhanced GPU on a wide range of applications, including multimedia and machine learning applications. Our evaluations show that a GPU enhanced with RMAC achieves a 5.2x energy-delay product improvement over a GPU using conventional FPUs while ensuring less than 2% quality loss. Compared with other state-of-the-art approximate multipliers, RMAC achieves 3.1x faster and 1.8x more energy-efficient computation while providing the same quality of service.
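The core idea in the abstract, replacing the mantissa multiplication with an addition and falling back to an exact multiply when the error would be too large, can be illustrated with a small software sketch. Writing the operands as (1 + a)·2^ex and (1 + b)·2^ey, the exact mantissa product is 1 + a + b + ab, and the approximation drops the ab term. The function name `approx_multiply` and the parameter `max_rel_error` are hypothetical. The accept-or-re-execute test below is a stand-in that computes the relative error directly; it is not the bit-level check on the first mantissa bits and the first N bits of the sum that the abstract describes. The sketch assumes finite inputs.

```python
import math


def approx_multiply(x, y, max_rel_error=0.02):
    """Return (product, used_exact) for finite floats x and y.

    Illustrative sketch only: the error test is a software stand-in,
    not the hardware check described in the abstract.
    """
    if x == 0.0 or y == 0.0:
        return 0.0, True

    sign = math.copysign(1.0, x) * math.copysign(1.0, y)

    # frexp returns m in [0.5, 1) with |x| = m * 2**e.
    # Rescale to the 1.f form: |x| = (1 + a) * 2**(e - 1), a in [0, 1).
    mx, ex = math.frexp(abs(x))
    my, ey = math.frexp(abs(y))
    a = 2.0 * mx - 1.0
    b = 2.0 * my - 1.0
    exponent = (ex - 1) + (ey - 1)

    # Exact mantissa product is 1 + a + b + a*b; the approximation
    # keeps only 1 + a + b, so the multiply becomes an addition.
    approx = sign * math.ldexp(1.0 + a + b, exponent)

    # Relative error introduced by dropping the a*b term.
    rel_error = (a * b) / ((1.0 + a) * (1.0 + b))

    if rel_error <= max_rel_error:
        return approx, False  # accept the approximate result
    return x * y, True        # re-execute the exact multiplication


if __name__ == "__main__":
    for pair in [(2.0, 3.0), (1.125, 1.125), (1.5, 1.5)]:
        print(pair, approx_multiply(*pair))
```

In this sketch, loosening `max_rel_error` plays a role loosely analogous to changing N in the abstract: more products are accepted from the cheap addition path, at the cost of accuracy.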

Tajana Simunic | Mohsen Imani | Saransh Gupta | Ricardo Garcia
