Quantifying the effect of disruptions to temporal coherence on the intelligibility of compressed American Sign Language video

Communication in American Sign Language (ASL) over mobile phones would be very beneficial to the Deaf community. ASL video encoded to achieve the rates provided by current cellular networks must be heavily compressed, and appropriate assessment techniques are required to analyze the intelligibility of the compressed video. As an extension to a purely spatial measure of intelligibility, this paper quantifies the effect of temporal compression artifacts on sign language intelligibility. These artifacts can result from frame rate reductions or from motion-compensation errors that distract the observer. They reduce the perception of smooth motion and disrupt the temporal coherence of the video. Motion-compensation errors that affect temporal coherence are identified by measuring the block-level correlation between co-located macroblocks in adjacent frames. The impact of frame rate reductions was quantified through experimental testing. A subjective study was performed in which participants fluent in ASL rated the intelligibility of sequences encoded at 5 different frame rates and with 3 different levels of distortion. The subjective data are used to parameterize an objective intelligibility measure that is highly correlated with subjective ratings at multiple frame rates.
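
The block-level correlation described above could be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not specify the correlation formula, so the Pearson coefficient, the 16x16 block size, the 0.5 threshold, the handling of flat blocks, and the function names `block_correlation` and `low_coherence_blocks` are all assumptions. Frames are taken to be 2-D lists of luminance values.

```python
from math import sqrt


def block_correlation(prev, curr, top, left, size=16):
    """Pearson correlation between co-located size x size blocks of two frames.

    Assumption: Pearson correlation stands in for whatever correlation
    measure the paper actually uses.
    """
    rows = range(top, top + size)
    cols = range(left, left + size)
    a = [prev[r][c] for r in rows for c in cols]
    b = [curr[r][c] for r in rows for c in cols]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Correlation is undefined for a flat block. Assumption: identical
        # blocks count as fully coherent, anything else as uncorrelated.
        return 1.0 if a == b else 0.0
    return cov / sqrt(var_a * var_b)


def low_coherence_blocks(prev, curr, size=16, threshold=0.5):
    """Return (top, left) corners of blocks whose correlation is below threshold.

    The threshold is a hypothetical value chosen for illustration only.
    Partial blocks at the right and bottom edges are skipped.
    """
    height, width = len(prev), len(prev[0])
    flagged = []
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            if block_correlation(prev, curr, top, left, size) < threshold:
                flagged.append((top, left))
    return flagged
```

Calling `low_coherence_blocks(prev_frame, curr_frame)` on two consecutive decoded frames returns the corners of the macroblocks whose content changed in a way that is poorly correlated with the previous frame. Such blocks are candidates for the distracting motion-compensation errors the abstract refers to.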
