Performance Impact of Unaligned Memory Operations in SIMD Extensions for Video Codec Applications
[1] Emmett Witchel,et al. Techniques for Increasing and Detecting Memory Alignment, 2001.
[2] Vladimir M. Pentkovski,et al. Implementing Streaming SIMD Extensions on the Pentium III Processor , 2000, IEEE Micro.
[3] Alan Jay Smith,et al. Measuring the Performance of Multimedia Instruction Sets , 2002, IEEE Trans. Computers.
[4] Philippe Roussel,et al. The microarchitecture of the Intel Pentium 4 processor on 90nm technology, 2004.
[5] Eric Rotenberg,et al. Trace cache: a low latency approach to high bandwidth instruction fetching , 1996, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29.
[6] Balaram Sinharoy,et al. POWER4 system microarchitecture , 2002, IBM J. Res. Dev.
[7] Pradeep K. Dubey,et al. How Multimedia Workloads Will Change Processor Design , 1997, Computer.
[8] Andreas Krall,et al. Compilation Techniques for Multimedia Processors , 2004, International Journal of Parallel Programming.
[9] Faouzi Kossentini,et al. H.264/AVC baseline profile decoder complexity analysis , 2003, IEEE Trans. Circuits Syst. Video Technol.
[10] Ruby B. Lee,et al. Challenges to Combining General-Purpose and Multimedia Processors , 1997, Computer.
[11] Shreekant S. Thakkar,et al. Internet Streaming SIMD Extensions , 1999, Computer.
[12] Yen-Kuang Chen,et al. Implementation of H.264 decoder on general-purpose processors with media instructions , 2003, IS&T/SPIE Electronic Imaging.
[13] Burzin A. Patel,et al. Optimization of instruction fetch mechanisms for high issue rates , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[14] Stamatis Vassiliadis,et al. The TM3270 media-processor , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).
[15] Iain E. G. Richardson,et al. Video Codec Design: Developing Image and Video Compression Systems, 2002.
[16] Jose Fridman. Data alignment for sub-word parallelism in DSP , 1999, 1999 IEEE Workshop on Signal Processing Systems. SiPS 99. Design and Implementation (Cat. No.99TH8461).
[17] Mayan Moudgill,et al. Environment for PowerPC microarchitecture exploration , 1999, IEEE Micro.
[18] Richard Henderson,et al. Multi-platform auto-vectorization , 2006, International Symposium on Code Generation and Optimization (CGO'06).
[19] Mateo Valero,et al. Adding a vector unit to a superscalar processor , 1999, ICS '99.
[20] Peng Wu,et al. Vectorization for SIMD architectures with alignment constraints , 2004, PLDI '04.
[21] Hunter Scales,et al. AltiVec Extension to PowerPC Accelerates Media Processing , 2000, IEEE Micro.
[22] D. Marpe,et al. Video coding with H.264/AVC: tools, performance, and complexity , 2004, IEEE Circuits and Systems Magazine.