Neural Acceleration for General-Purpose Approximate Programs
Luis Ceze | Doug Burger | Hadi Esmaeilzadeh | Adrian Sampson
[1] L. Ceze,et al. Towards Neural Acceleration for General-Purpose Approximate Computing , 2012 .
[2] Karthikeyan Sankaralingam,et al. Relax: an architectural framework for software recovery of hardware faults , 2010, ISCA.
[3] Donald Yeung,et al. Exploiting Soft Computing for Increased Fault Tolerance , 2006 .
[4] K. Wojtek Przytula. Parallel digital implementations of neural networks , 1991, ASAP.
[5] Huawei Li,et al. A Fault Criticality Evaluation Framework of Digital Systems for Error Tolerant Video Applications , 2011, 2011 Asian Test Symposium.
[6] Steven Swanson,et al. QSCORES: Trading dark silicon for scalable energy efficiency with quasi-specific cores , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[7] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[8] Gu-Yeon Wei,et al. Process Variation Tolerant 3T1D-Based Cache Architectures , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[9] Scott A. Mahlke,et al. Bridging the computation gap between programmable processors and hardwired accelerators , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[10] Steven Swanson,et al. Conservation cores: reducing the energy of mature computations , 2010, ASPLOS XV.
[11] Robert A. van de Geijn,et al. Codesign Tradeoffs for High-Performance, Low-Power Linear Algebra Architectures , 2012, IEEE Transactions on Computers.
[12] Luis Ceze,et al. Neural Acceleration for General-Purpose Approximate Programs , 2014, IEEE Micro.
[13] Jeff Mason,et al. CHiMPS: A C-level compilation flow for hybrid CPU-FPGA architectures , 2008, 2008 International Conference on Field Programmable Logic and Applications.
[14] Karthikeyan Sankaralingam,et al. Dynamically Specialized Datapaths for energy efficient computing , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.
[15] Amin Ansari,et al. Bundled execution of recurring traces for energy-efficient general purpose processing , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[16] Olivier Temam,et al. A defect-tolerant accelerator for emerging high-performance applications , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[17] Luis Ceze,et al. Architecture support for disciplined approximate programming , 2012, ASPLOS XVII.
[18] A. Ailamaki,et al. Toward Dark Silicon in Servers , 2011, IEEE Micro.
[19] Michael D. Smith,et al. A high-performance microarchitecture with hardware-programmable functional units , 1994, Proceedings of MICRO-27. The 27th Annual IEEE/ACM International Symposium on Microarchitecture.
[20] Lionel Tarassenko,et al. Estimations of error bounds for neural-network function approximators , 1999, IEEE Trans. Neural Networks.
[21] E. Sackinger,et al. An Analog Neural Network Processor With Programmable Network Topology , 1991, 1991 IEEE International Solid-State Circuits Conference. Digest of Technical Papers.
[22] Dan Grossman,et al. EnerJ: approximate data types for safe and general low-power computation , 2011, PLDI '11.
[23] Jihan Zhu,et al. FPGA Implementations of Neural Networks - A Survey of a Decade of Progress , 2003, FPL.
[24] Henry Hoffmann,et al. Managing performance vs. accuracy trade-offs with loop perforation , 2011, ESEC/FSE '11.
[25] Douglas L. Jones,et al. Scalable stochastic processors , 2010, 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010).
[26] Johannes Schemmel,et al. Wafer-scale integration of analog neural networks , 2008, 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence).
[27] K. Ghose,et al. MARSSx86: A Full System Simulator for x86 CPUs , 2011 .
[28] Scott A. Mahlke,et al. Application-Specific Processing on a General-Purpose Core via Transparent Instruction Set Customization , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[29] Mikko H. Lipasti,et al. A case for neuromorphic ISAs , 2011, ASPLOS XVI.
[30] B. Gupta,et al. Learning on an analog VLSI neural network chip , 1990, 1990 IEEE International Conference on Systems, Man, and Cybernetics Conference Proceedings.
[31] Krishna V. Palem,et al. Ultra-Efficient (Embedded) SOC Architectures based on Probabilistic CMOS (PCMOS) Technology , 2006, Proceedings of the Design Automation & Test in Europe Conference.
[32] Song Liu,et al. Flikker: saving DRAM refresh-power through critical data partitioning , 2011, ASPLOS XVI.
[33] Keechul Jung,et al. GPU implementation of neural networks , 2004, Pattern Recognit..
[34] Jung Ho Ahn,et al. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[35] Lawrence D. Jackel,et al. An analog neural network processor with programmable topology , 1991 .
[36] Naresh R. Shanbhag,et al. Energy-efficient signal processing via algorithmic noise-tolerance , 1999, Proceedings. 1999 International Symposium on Low Power Electronics and Design.
[37] Subhasish Mitra,et al. ERSA: Error Resilient System Architecture for probabilistic applications , 2010, 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010).
[38] Woongki Baek,et al. Green: a framework for supporting energy-conscious programming using controlled approximation , 2010, PLDI '10.
[39] Karthikeyan Sankaralingam,et al. Dark Silicon and the End of Multicore Scaling , 2012, IEEE Micro.
[40] Karthik Pattabiraman,et al. Flicker: Saving Refresh-Power in Mobile Devices through Critical Data Partitioning , 2009 .
[41] Mikko H. Lipasti,et al. Automatic abstraction and fault tolerance in cortical microarchitectures , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[42] Shunfei Chen,et al. MARSS: A full system simulator for multicore x86 CPUs , 2011, 2011 48th ACM/EDAC/IEEE Design Automation Conference (DAC).
[43] Jagath C. Rajapakse,et al. FPGA Implementations of Neural Networks , 2006 .
[44] Mark Horowitz,et al. Energy-Efficient Floating-Point Unit Design , 2011, IEEE Transactions on Computers.
[45] Mikko H. Lipasti,et al. BenchNN: On the broad potential application scope of hardware neural network accelerators , 2012, 2012 IEEE International Symposium on Workload Characterization (IISWC).
[46] K. Sankaralingam,et al. Exploring the Synergy of Emerging Workloads and Silicon Reliability Trends , 2009 .
[47] R.H. Dennard,et al. Design Of Ion-implanted MOSFET's with Very Small Physical Dimensions , 1974, Proceedings of the IEEE.
[48] I. G. Persiantsev,et al. Multifold Acceleration of Neural Network Computations Using GPU , 2009, ICANN.
[49] Norman P. Jouppi,et al. Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0 , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[50] Vicky Wong,et al. Soft Error Resilience of Probabilistic Inference Applications , 2006 .
[51] Babak Nadjar Araabi,et al. Neural network stream processing core (NnSP) for embedded systems , 2006, 2006 IEEE International Symposium on Circuits and Systems.
[52] Karthikeyan Sankaralingam,et al. Power challenges may end the multicore era , 2013, CACM.
[53] Olivier Temam,et al. Hardware spiking neurons design: Analog or digital? , 2012, The 2012 International Joint Conference on Neural Networks (IJCNN).
[54] Christoforos E. Kozyrakis,et al. Understanding sources of inefficiency in general-purpose chips , 2010, ISCA.
[55] M. Valero,et al. Fuzzy memoization for floating-point multimedia applications , 2005, IEEE Transactions on Computers.
[56] Henry Hoffmann,et al. Quality of service profiling , 2010, 2010 ACM/IEEE 32nd International Conference on Software Engineering.