Mapping of application software to the multimedia instructions of general-purpose microprocessors

This paper describes how media processing programs can be accelerated by using the multimedia instruction extensions that have been added to general-purpose microprocessors. As a concrete example, it describes MAX2, a minimalist, second-generation set of multimedia instructions included in the PA-RISC 2.0 processor architecture. MAX2 implements subword-parallel instructions, which use the microprocessor's 64-bit-wide data paths to process multiple pieces of lower-precision data in parallel. It also includes new instructions such as Mix, which are useful for matrix transpose and other common data rearrangements. The paper examines some typical multimedia kernels, namely block match, matrix transpose, box filter, and the IDCT, each coded with and without the MAX2 instructions, to illustrate programming techniques for exploiting subword parallelism and superscalar instruction parallelism. The kernels that use MAX2 show significant speedups in execution time and make more efficient use of the processor's resources.
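The two ideas named above, subword parallelism and Mix-based data rearrangement, can be illustrated with a small software emulation. The sketch below is not MAX2 code and does not come from the paper: it models a 64-bit register as a Python integer holding four 16-bit subwords, and every function name (`pack4`, `padd16`, `mixh_l`, `mixw_r`, `transpose4x4`, and so on) is hypothetical. The Mix behaviour is modelled on the assumption that the "left" form interleaves the even-indexed subwords of its two sources and the "right" form interleaves the odd-indexed ones.

```python
MASK64 = (1 << 64) - 1
HI = 0x8000800080008000  # top bit of each 16-bit lane
LO = 0x7FFF7FFF7FFF7FFF  # low 15 bits of each 16-bit lane


def pack4(halfwords):
    """Pack four 16-bit values into one 64-bit word, halfwords[0] leftmost."""
    word = 0
    for h in halfwords:
        word = (word << 16) | (h & 0xFFFF)
    return word


def unpack4(word):
    """Split a 64-bit word into four 16-bit values, leftmost first."""
    return [(word >> shift) & 0xFFFF for shift in (48, 32, 16, 0)]


def padd16(a, b):
    """Add four 16-bit lanes at once (modulo 2**16), with no carry between lanes."""
    low = (a & LO) + (b & LO)          # sums of the low 15 bits stay inside each lane
    return (low ^ ((a ^ b) & HI)) & MASK64  # fold in the top bits without carrying out


def mixh_l(a, b):
    """Assumed 'mix halfwords left': a0 b0 a2 b2."""
    x, y = unpack4(a), unpack4(b)
    return pack4([x[0], y[0], x[2], y[2]])


def mixh_r(a, b):
    """Assumed 'mix halfwords right': a1 b1 a3 b3."""
    x, y = unpack4(a), unpack4(b)
    return pack4([x[1], y[1], x[3], y[3]])


def mixw_l(a, b):
    """Assumed 'mix words left': left 32 bits of a, then left 32 bits of b."""
    return ((a >> 32) << 32) | (b >> 32)


def mixw_r(a, b):
    """Assumed 'mix words right': right 32 bits of a, then right 32 bits of b."""
    return ((a & 0xFFFFFFFF) << 32) | (b & 0xFFFFFFFF)


def transpose4x4(rows):
    """Transpose a 4x4 matrix of 16-bit values held as four packed 64-bit words."""
    r0, r1, r2, r3 = rows
    t0, t1 = mixh_l(r0, r1), mixh_r(r0, r1)  # columns 0,2 and 1,3 of rows 0-1
    t2, t3 = mixh_l(r2, r3), mixh_r(r2, r3)  # columns 0,2 and 1,3 of rows 2-3
    return [mixw_l(t0, t2), mixw_l(t1, t3), mixw_r(t0, t2), mixw_r(t1, t3)]


if __name__ == "__main__":
    matrix = [pack4([10 * r + c for c in range(4)]) for r in range(4)]
    for row in transpose4x4(matrix):
        print(unpack4(row))
```

In this model the transpose takes eight mix operations on four packed words and never unpacks an element individually, which is the kind of rearrangement the abstract attributes to Mix. On real hardware each emulated function would correspond to a single instruction rather than a Python loop.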
