Towards Neural Acceleration for General-Purpose Approximate Computing
暂无分享,去创建一个
[1] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[2] Lawrence D. Jackel,et al. An analog neural network processor with programmable topology , 1991 .
[3] Michael D. Smith,et al. A high-performance microarchitecture with hardware-programmable functional units , 1994, Proceedings of MICRO-27. The 27th Annual IEEE/ACM International Symposium on Microarchitecture.
[4] Richard E. Kessler,et al. The Alpha 21264 microprocessor , 1999, IEEE Micro.
[5] Jihan Zhu,et al. FPGA Implementations of Neural Networks - A Survey of a Decade of Progress , 2003, FPL.
[6] Scott A. Mahlke,et al. Application-Specific Processing on a General-Purpose Core via Transparent Instruction Set Customization , 2004, 37th International Symposium on Microarchitecture (MICRO-37'04).
[7] M. Valero,et al. Fuzzy memoization for floating-point multimedia applications , 2005, IEEE Transactions on Computers.
[8] Donald Yeung,et al. Exploiting Soft Computing for Increased Fault Tolerance , 2006 .
[9] Krishna V. Palem,et al. Ultra-Efficient (Embedded) SOC Architectures based on Probabilistic CMOS (PCMOS) Technology , 2006, Proceedings of the Design Automation & Test in Europe Conference.
[10] Babak Nadjar Araabi,et al. Neural network stream processing core (NnSP) for embedded systems , 2006, 2006 IEEE International Symposium on Circuits and Systems.
[11] David V. Anderson,et al. An Analog Programmable Multidimensional Radial Basis Function Based Classifier , 2007, IEEE Transactions on Circuits and Systems I: Regular Papers.
[12] Norman P. Jouppi,et al. Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0 , 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[13] Donald Yeung,et al. Exploiting Application-Level Correctness for Low-Cost Fault Tolerance , 2008, J. Instr. Level Parallelism.
[14] Scott A. Mahlke,et al. Bridging the computation gap between programmable processors and hardwired accelerators , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.
[15] Jung Ho Ahn,et al. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[16] Karthikeyan Sankaralingam,et al. Relax: an architectural framework for software recovery of hardware faults , 2010, ISCA.
[17] Douglas L. Jones,et al. Scalable stochastic processors , 2010, 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010).
[18] Woongki Baek,et al. Green: a framework for supporting energy-conscious programming using controlled approximation , 2010, PLDI '10.
[19] Quinn Jacobson,et al. ERSA: error resilient system architecture for probabilistic applications , 2010, DATE 2010.
[20] Steven Swanson,et al. QSCORES: Trading dark silicon for scalable energy efficiency with quasi-specific cores , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[21] Dan Grossman,et al. EnerJ: approximate data types for safe and general low-power computation , 2011, PLDI '11.
[22] Henry Hoffmann,et al. Managing performance vs. accuracy trade-offs with loop perforation , 2011, ESEC/FSE '11.
[23] Mikko H. Lipasti,et al. A case for neuromorphic ISAs , 2011, ASPLOS XVI.
[24] Song Liu,et al. Flikker: saving DRAM refresh-power through critical data partitioning , 2011, ASPLOS XVI.
[25] Mark Horowitz,et al. Energy-Efficient Floating-Point Unit Design , 2011, IEEE Transactions on Computers.
[26] Luis Ceze,et al. Architecture support for disciplined approximate programming , 2012, ASPLOS XVII.
[27] Karthikeyan Sankaralingam,et al. Dark Silicon and the End of Multicore Scaling , 2012, IEEE Micro.