EMEURO: A framework for generating multi-purpose accelerators via deep learning
暂无分享,去创建一个
[1] Henry Hoffmann,et al. Dynamic knobs for responsive power-aware computing , 2011, ASPLOS XVI.
[2] Martin Rinard,et al. Using Code Perforation to Improve Performance, Reduce Energy Consumption, and Respond to Failures , 2009 .
[3] Kunle Olukotun,et al. The Future of Microprocessors , 2005, ACM Queue.
[4] Martin C. Rinard. Using early phase termination to eliminate load imbalances at barrier synchronization points , 2007, OOPSLA.
[5] Johannes Schemmel,et al. Wafer-scale integration of analog neural networks , 2008, 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence).
[6] Jacob Nelson,et al. SNNAP: Approximate computing on programmable SoCs via neural acceleration , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[7] Christoforos E. Kozyrakis,et al. Understanding sources of inefficiency in general-purpose chips , 2010, ISCA.
[8] Scott A. Mahlke,et al. Paraprox: pattern-based approximation for data parallel applications , 2014, ASPLOS.
[9] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[10] Jacob Nelson,et al. Approximate storage in solid-state memories , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[11] Luis Ceze,et al. Neural Acceleration for General-Purpose Approximate Programs , 2014, IEEE Micro.
[12] Luis Ceze,et al. Architecture support for disciplined approximate programming , 2012, ASPLOS XVII.
[13] Scott A. Mahlke,et al. SAGE: Self-tuning approximation for graphics engines , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[14] J. Tukey,et al. An algorithm for the machine calculation of complex Fourier series , 1965 .
[15] Yoshua Bengio,et al. Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.
[16] Kunle Olukotun,et al. A highly scalable Restricted Boltzmann Machine FPGA implementation , 2009, 2009 International Conference on Field Programmable Logic and Applications.
[17] Krishna V. Palem,et al. Ultra-Efficient (Embedded) SOC Architectures based on Probabilistic CMOS (PCMOS) Technology , 2006, Proceedings of the Design Automation & Test in Europe Conference.
[18] Olivier Temam,et al. A defect-tolerant accelerator for emerging high-performance applications , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[19] Dhananjay S. Phatak,et al. Investigating the Fault Tolerance of Neural Networks , 2005, Neural Computation.
[20] Woongki Baek,et al. Green: a framework for supporting energy-conscious programming using controlled approximation , 2010, PLDI '10.
[21] Alan Edelman,et al. PetaBricks: a language and compiler for algorithmic choice , 2009, PLDI '09.
[22] Karthikeyan Sankaralingam,et al. Dark Silicon and the End of Multicore Scaling , 2012, IEEE Micro.
[23] Alan Edelman,et al. Language and compiler support for auto-tuning variable-accuracy algorithms , 2011, International Symposium on Code Generation and Optimization (CGO 2011).
[24] Martin C. Rinard. Probabilistic accuracy bounds for fault-tolerant computations that discard tasks , 2006, ICS '06.
[25] Jie Han,et al. Approximate computing: An emerging paradigm for energy-efficient design , 2013, 2013 18th IEEE European Test Symposium (ETS).
[26] Dan Grossman,et al. EnerJ: approximate data types for safe and general low-power computation , 2011, PLDI '11.
[27] I. G. Persiantsev,et al. Multifold Acceleration of Neural Network Computations Using GPU , 2009, ICANN.
[28] Mikko H. Lipasti,et al. BenchNN: On the broad potential application scope of hardware neural network accelerators , 2012, 2012 IEEE International Symposium on Workload Characterization (IISWC).
[29] Henry Hoffmann,et al. Managing performance vs. accuracy trade-offs with loop perforation , 2011, ESEC/FSE '11.