Exploiting the Potential of Computation Reuse Through Approximate Computing
Xin He | Xiaowei Li | Yinhe Han | Wenyan Lu | Guihai Yan | Shuhao Jiang