MMX-based DCT and MC algorithms for real-time pure software MPEG decoding

To cope with computation-intensive multimedia applications, major CPU manufacturers such as Intel™ and Digital™ have added new instruction sets to their CPU families to improve multimedia performance. These instruction sets are essentially of the Single Instruction, Multiple Data (SIMD) type. For practical reasons (e.g., the trade-off between hardware complexity and the performance gained), the manufacturers implement a reduced SIMD instruction set rather than a full one. Intel's extension, for example, consists of 57 instructions known as the MultiMedia eXtension (MMX) instruction set. How to fully exploit these instructions in multimedia applications has therefore become an interesting and important issue. We present an efficient MMX-based realization of the block Inverse Discrete Cosine Transform (IDCT) and Motion Compensation (MC), which are kernel components of block-based video decoding standards such as MPEG-1, MPEG-2, H.261 and H.263. The results show that, with a suitable SIMD instruction set, a pure software solution for demanding multimedia applications such as real-time MPEG video decoding becomes feasible.
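As a rough illustration of why motion compensation suits a SIMD instruction set, the sketch below shows the reconstruction step for one 8-pixel row in portable C: each predicted pixel is added to its decoded residual and the sum is clamped to the 0..255 range. It is not the implementation described here; the names mc_add_row8 and clamp_u8 are hypothetical. The same arithmetic is applied independently to every pixel, which is the pattern that packed instructions with saturation (for instance, MMX's PADDSW followed by PACKUSWB) can perform on several lanes at once.

```c
#include <stdint.h>
#include <stddef.h>

/* Clamp an intermediate sum to the 0..255 pixel range. This mirrors,
   per lane, what an unsigned-saturating pack instruction would do. */
static uint8_t clamp_u8(int v) {
    if (v < 0) return 0;
    if (v > 255) return 255;
    return (uint8_t)v;
}

/* Hypothetical helper: reconstruct one row of 8 pixels.
   pred  - 8 predicted pixels fetched from the reference frame
   resid - 8 signed residuals produced by the IDCT
   dst   - 8 reconstructed output pixels
   Every iteration is independent of the others, so a SIMD unit could
   process all lanes with a handful of packed instructions. */
static void mc_add_row8(uint8_t *dst, const uint8_t *pred,
                        const int16_t *resid) {
    for (size_t i = 0; i < 8; ++i) {
        dst[i] = clamp_u8((int)pred[i] + (int)resid[i]);
    }
}
```

An 8x8 block would call this helper once per row. In a scalar build the clamp costs two comparisons per pixel, whereas saturating packed arithmetic folds the clamp into the add and pack operations themselves.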