MacSim: A MAC-Enabled High-Performance Low-Power SIMD Architecture
[1] Yifan He,et al. Xetal-Pro: An ultra-low energy and high throughput SIMD processor , 2010, Design Automation Conference.
[2] Scott A. Mahlke,et al. AnySP: Anytime Anywhere Anyway Signal Processing , 2010, IEEE Micro.
[3] Michael E. Wolf,et al. Improving locality and parallelism in nested loops , 1992 .
[4] Stephen Strazdus,et al. Multiply-accumulate (MAC) unit for single-instruction/multiple-data (SIMD) instructions , 2002 .
[5] Marta Jiménez,et al. Register tiling in nonrectangular iteration spaces , 2002, TOPLAS.
[6] Yuyun Liao,et al. A high-performance and low-power 32-bit multiply-accumulate unit with single-instruction-multiple-data (SIMD) feature , 2002, IEEE J. Solid State Circuits.
[7] Kalyan Mondal,et al. Compact carry-save multiplier architecture and its applications , 1999 .
[8] Michael Wolfe,et al. More iteration space tiling , 1989, Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89).
[9] Charles Roth,et al. A low-power, high-speed implementation of a PowerPC™ microprocessor vector extension , 1999, Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336).
[10] Yukio Sugeno,et al. A Multiplier-Accumulator Macro for a 45 MIPS Embedded RISC Processor , 1995, ESSCIRC '95: Twenty-first European Solid-State Circuits Conference.
[11] Kalyan Mondal,et al. A compact carry-save multiplier architecture and its applications , 1997, Proceedings of 40th Midwest Symposium on Circuits and Systems. Dedicated to the Memory of Professor Mac Van Valkenburg.
[12] F. Elguibaly,et al. A fast parallel multiplier-accumulator using the modified Booth algorithm , 2000 .
[13] Magdy A. Bayoumi,et al. High Speed and Area-Efficient Multiply Accumulate (MAC) Unit for Digital Signal Prossing Applications , 2007, 2007 IEEE International Symposium on Circuits and Systems.
[14] Henk Corporaal,et al. Speed sign detection and recognition by convolutional neural networks , 2011 .
[15] Subhadeep Roy. A sub-word-parallel Galois field multiply-accumulate unit for digital signal processors , 2005, 2005 IEEE International Symposium on Circuits and Systems.
[16] Monica S. Lam,et al. A data locality optimizing algorithm , 1991, PLDI '91.
[17] Yifan He,et al. SIMD made explicit , 2013, 2013 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS).
[18] Scott A. Mahlke,et al. MacroSS: macro-SIMDization of streaming applications , 2010, ASPLOS XV.
[19] Jingling Xue,et al. Loop Tiling for Parallelism , 2000, Kluwer International Series in Engineering and Computer Science.