Paper information - Region-of-Interest Based Video Image Transcoding for Heterogenous Client Displays

Region-of-Interest Based Video Image Transcoding for Heterogenous Client Displays

Problem: Broadcasting quality video in a form that suits each client’s individual display capabilities is a formidable problem. As a solution, we present an algorithm for adapting video image size to multiple heterogeneous clients while maximizing the information content of the transcoded image.
Solution: If an image is decimated, some sub-images will have a higher visual attention (VA) value than others. Studies show that most useful information is contained in areas with high VA. Given an image, the VA of individual sub-images can be determined using any of several existing methods. Whichever method is used, the VA values of individual image blocks are assigned over the range 0.0 to 1.0, depending on the importance of each block’s contents. Then, given the quantitative VA values of all blocks in the image, the Region-of-Interest (the cropped sub-image) is chosen to encompass as many regions of high VA value as possible, thereby maximizing the information content (VA value) of the final image. This is the crux of our algorithm.
Results: A 512x512 grayscale image was considered for transcoding for five types of client displays: workstation (256x256), desktop (192x192), TV-browser (128x128), hand-held (96x96), and personal digital assistant (PDA) (64x64). For simplicity, the aspect ratio of all displays was assumed to be 1.00. The 512x512 image was divided into 64 sub-images (8 rows, 8 columns). Each 64x64 block was assigned a VA value. The new transcoding algorithm was applied to get the final image for each display size. Results showed that in the workstation image, the most important regions of the original image were preserved, and the image retained significant global and local information. In the PDA image, although significant global information could not be preserved, the algorithm retained the best local information under the display-size and compression-ratio constraints.
Conclusions: Results show that region-of-interest based cropping, with maximized visual attention value, is the most natural method for cropping an image for transcoding for display on heterogeneous clients.
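
The region selection step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the crop window is aligned to whole blocks, that the "information content" of a window is the plain sum of its block VA values, and that ties go to the first window found. The function name `best_roi` and the sample grid are hypothetical. Display sizes that are not a multiple of the 64-pixel block (such as 96x96) would need an additional scaling step that the abstract does not specify.

```python
from itertools import product


def best_roi(va, win_rows, win_cols):
    """Return (top, left, total) for the block-aligned window with the largest VA sum.

    va is a grid (list of rows) of visual attention values in [0.0, 1.0];
    win_rows and win_cols give the window size in blocks (an assumption of this sketch).
    """
    rows, cols = len(va), len(va[0])
    if not (1 <= win_rows <= rows and 1 <= win_cols <= cols):
        raise ValueError("window must fit inside the block grid")

    # Summed-area table: sat[r][c] is the sum of va over rows < r and columns < c.
    sat = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for r, c in product(range(rows), range(cols)):
        sat[r + 1][c + 1] = va[r][c] + sat[r][c + 1] + sat[r + 1][c] - sat[r][c]

    best = None
    for top, left in product(range(rows - win_rows + 1), range(cols - win_cols + 1)):
        bottom, right = top + win_rows, left + win_cols
        # Window sum from four table lookups.
        total = sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]
        if best is None or total > best[2]:
            best = (top, left, total)
    return best


# Hypothetical 8x8 VA grid for a 512x512 image split into 64x64 blocks.
BLOCK = 64
va = [[0.0] * 8 for _ in range(8)]
va[2][3], va[2][4], va[3][3], va[3][4] = 0.9, 0.8, 0.7, 1.0
va[6][1] = 0.5

# A 128x128 display corresponds to a 2x2-block window under the assumptions above.
top, left, total = best_roi(va, 2, 2)  # top=2, left=3 for this grid
crop_box = (left * BLOCK, top * BLOCK, (left + 2) * BLOCK, (top + 2) * BLOCK)
```

Here `crop_box` is the pixel rectangle (x0, y0, x1, y1) of the selected region. The summed-area table keeps each window evaluation at constant cost, although for an 8x8 grid a direct sum over every window would work just as well.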

Karun B. Shimoga

[1] John R. Smith, et al. Transcoding Internet content for heterogeneous client devices, 1998, ISCAS '98. Proceedings of the 1998 IEEE International Symposium on Circuits and Systems (Cat. No.98CH36187).

[2] Claudio M. Privitera, et al. Focused JPEG encoding based upon automatic preidentified regions of interest, 1999, Electronic Imaging.

[3] F. Pellandini, et al. Adaptive color image compression based on visual attention, 2001, Proceedings 11th International Conference on Image Analysis and Processing.

[4] John R. Smith, et al. Content-based transcoding of images in the Internet, 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).

[5] John R. Smith, et al. Multimedia content description in the InfoPyramid, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[6] Keansub Lee, et al. Perception-based image transcoding for universal multimedia access, 2001, Proceedings 2001 International Conference on Image Processing (Cat. No.01CH37205).