Exploiting spatio-temporal characteristics of human vision for mobile video applications

Video applications on handheld devices such as smartphones pose a significant challenge for delivering a high-quality user experience. Recent advances in processor and wireless networking technology are producing a new class of multimedia applications (e.g., video streaming) for mobile handheld devices. These devices are lightweight and compact, and therefore have very limited resources: lower processing power, smaller display resolution, less memory, and shorter battery life than desktop and laptop systems. Multimedia applications, on the other hand, have extensive processing requirements that strain these constrained devices. In addition, device-specific properties (e.g., the display screen) significantly influence human perception of multimedia quality. In this paper, we propose a saliency-based framework that exploits both the structure of content creation and the human visual system to find salient points in the incoming bitstream and adapt it to the target device, thereby improving the quality of the adapted area around those salient points. Our experimental results indicate that an adaptation process that is cognizant of video content and user preferences can produce video of better perceptual quality for mobile devices. Furthermore, we demonstrate how such a framework affects user experience on a handheld device.
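To make the adaptation idea concrete, the sketch below shows one simple way a per-frame saliency map could drive device-aware cropping: center a window of the target display's resolution on the saliency peak, clamped to the frame. This is a hypothetical illustration under the assumption that a saliency map is already available (e.g., from a visual-attention model); the function name and the peak-centered policy are ours, not the paper's actual algorithm.

```python
# Illustrative sketch (not the paper's algorithm): crop a frame to a
# target-sized window centered on the most salient point.
import numpy as np

def crop_to_salient(saliency: np.ndarray, target_w: int, target_h: int):
    """Return (x, y, w, h) of a target-sized window centered on the
    saliency peak, clamped to the frame boundaries."""
    frame_h, frame_w = saliency.shape
    # Locate the most salient pixel in the map.
    peak_y, peak_x = np.unravel_index(np.argmax(saliency), saliency.shape)
    # Center the window on the peak, then clamp so it stays inside the frame.
    x = min(max(peak_x - target_w // 2, 0), frame_w - target_w)
    y = min(max(peak_y - target_h // 2, 0), frame_h - target_h)
    return int(x), int(y), target_w, target_h

# Example: a 720x480 frame whose saliency peak sits near the right edge;
# the crop window is pulled back so it does not fall off the frame.
sal = np.zeros((480, 720))
sal[200, 700] = 1.0
print(crop_to_salient(sal, 320, 240))  # → (400, 80, 320, 240)
```

A real pipeline would smooth the window position across frames to avoid jitter, but the core device-aware step, fitting the salient region to the target resolution, is as shown.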
