Architectural advances in the VLSI implementation of arithmetic coding for binary image compression

This paper presents recent architectural advances for the data compression technique known as arithmetic coding. The new architecture employs loop unrolling and speculative execution of the inner loop of the algorithm to achieve a significant speed-up relative to the Q-coder architecture. This approach reduces the number of iterations required to compress a block of data by a factor on the order of the compression ratio. Although the speed-up technique was previously discovered independently by researchers at IBM, no systematic study of the architectural trade-offs has been published. For the CCITT facsimile documents, the new architecture achieves a speed-up of approximately a factor of seven over the IBM Q-coder when four lookahead units are employed in parallel. A structure for fast input/output processing, based on run-length pre-coding of the data stream, is also presented to accompany the new architecture.
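
The idea behind run-length pre-coding of a binary data stream can be illustrated with a minimal sketch. It is not the architecture described in the paper: the scan line is made up, the function name `run_lengths` is hypothetical, and the model simply assumes one coder iteration per pixel in the baseline and one per run after pre-coding. The lookahead units and the probability estimation of the arithmetic coder are not modeled.

```python
from itertools import groupby


def run_lengths(bits):
    """Collapse a sequence of 0/1 pixels into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(bits)]


# Hypothetical scan line: mostly white (0) with two short black (1) runs.
scan_line = [0] * 40 + [1] * 3 + [0] * 50 + [1] * 2 + [0] * 33

runs = run_lengths(scan_line)

# Simplified iteration counts (an assumption of this sketch, not a measurement):
pixel_iterations = len(scan_line)  # baseline: one iteration per pixel
run_iterations = len(runs)         # pre-coded: one iteration per run

print(runs)
print(pixel_iterations, run_iterations)
```

For highly compressible input such as facsimile scan lines, the number of runs is far smaller than the number of pixels, which is the intuition behind an iteration count that shrinks roughly in proportion to the compression ratio.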