A specialized low-cost vectorized loop buffer for embedded processors
Li Shen | Cong Liu | Nong Xiao | Zhiying Wang | Libo Huang | Hongyi Lu
[1] Martin Hopkins, et al. Synergistic Processing in Cell's Multicore Architecture, 2006, IEEE Micro.
[2] Ibrahim N. Hajj, et al. Architectural and compiler support for energy reduction in the memory hierarchy of high performance microprocessors, 1998, Proceedings. 1998 International Symposium on Low Power Electronics and Design (IEEE Cat. No.98TH8379).
[3] James E. Smith, et al. Vector instruction set support for conditional operations, 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[4] William H. Mangione-Smith, et al. The filter cache: an energy efficient memory structure, 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[5] Wei Shi, et al. SIF: Overcoming the limitations of SIMD devices via implicit permutation, 2010, HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture.
[6] Ayal Zaks, et al. Vectorizing for a SIMdD DSP architecture, 2003, CASES '03.
[7] Alan Jay Smith, et al. Measuring the Performance of Multimedia Instruction Sets, 2002, IEEE Trans. Computers.
[8] Peng Wu, et al. Vectorization for SIMD architectures with alignment constraints, 2004, PLDI '04.
[9] Lizy Kurian John, et al. Cost-effective hardware acceleration of multimedia applications, 2001, Proceedings 2001 IEEE International Conference on Computer Design: VLSI in Computers and Processors. ICCD 2001.
[10] Raminder Singh Bajwa, et al. Instruction buffering to reduce power in processors for signal processing, 1997, IEEE Trans. Very Large Scale Integr. Syst.
[11] Sanjive Agarwala, et al. Effective hardware-based two-way loop cache for high performance low power processors, 2000, Proceedings 2000 International Conference on Computer Design.