Linear Image Processing Operations With Operational Tight Packing

Computer hardware with native support for large-bitwidth operations can be used for the concurrent calculation of multiple independent linear image processing operations when these operations map integers to integers. This is achieved by packing multiple input samples into one large-bitwidth number, performing a single operation on that number, and unpacking the results. We propose an operational framework for tight packing, i.e., achieving the maximum packing possible for a given implementation. We validate our framework on floating-point units natively supported in mainstream programmable processors. For image processing tasks where operational tight packing leads to increased packing in comparison to previously known operational packing, the processing throughput is increased by up to 25%.
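
The pack, operate, unpack idea described above can be illustrated with a minimal sketch. It is not the tight-packing framework proposed here; it shows only plain two-sample packing in a double-precision float, and the names `pack`, `unpack`, `packed_dot`, and the parameter `shift` are hypothetical. It assumes every per-sample result has magnitude below 2**(shift - 1) and that all packed values stay below 2**53 in magnitude, so the floating-point arithmetic is exact.

```python
import math


def pack(a, b, shift):
    """Pack two integers into one float: a in the low part, b scaled by 2**shift."""
    return float(a) + float(b) * float(2 ** shift)


def unpack(p, shift):
    """Recover the two signed integers from a packed float.

    Assumes the low-part result has magnitude below 2**(shift - 1).
    """
    scale = float(2 ** shift)
    hi = math.floor(p / scale + 0.5)  # round to nearest to absorb a negative low part
    lo = p - hi * scale
    return int(lo), int(hi)


def packed_dot(xs_a, xs_b, coeffs, shift):
    """Apply the same integer weights to two sample sequences using one packed sum."""
    acc = 0.0
    for a, b, c in zip(xs_a, xs_b, coeffs):
        acc += c * pack(a, b, shift)  # one multiply-add serves both sequences
    return unpack(acc, shift)


if __name__ == "__main__":
    row_a = [10, -20, 30]
    row_b = [1, 2, 3]
    taps = [1, -2, 1]
    print(packed_dot(row_a, row_b, taps, shift=16))
```

Because the operation is linear, the weighted sum of packed values equals the packed pair of the two individual weighted sums, which is why a single pass over the packed data is enough. The choice of `shift` trades dynamic range per sample against the number of samples that fit in one number.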
