MobileASL: Intelligibility of sign language video over mobile phones

For Deaf people, access to the mobile telephone network in the United States is currently limited to text messaging, forcing communication in English rather than American Sign Language (ASL), the preferred language. Because ASL is a visual language, mobile video phones have the potential to give Deaf people access to real-time mobile communication in their preferred language. However, even today's best video compression techniques cannot yield intelligible ASL at the limited bandwidths of cell phone networks. Motivated by this constraint, we conducted one focus group and two user studies with members of the Deaf Community to determine the intelligibility effects of video compression techniques that exploit the visual nature of sign language. Inspired by eye-tracking results showing that high-resolution foveal vision is maintained around the face, we studied region-of-interest encodings (where the face is encoded at higher quality) as well as reduced frame rates (where fewer, higher-quality frames are displayed each second). At all bit rates studied here, participants preferred moderate quality increases in the face region, sacrificing quality in other regions. They also preferred slightly lower frame rates because they yield higher-quality frames at a fixed bit rate. The limited processing power of cell phones is a serious concern because a real-time video encoder and decoder will be needed. Choosing less complex encoder settings can reduce encoding time but also affects video quality. We studied the intelligibility effects of this tradeoff and found that encoding can be sped up significantly without severely affecting intelligibility. These results show promise for real-time access to the current low-bandwidth cell phone network through sign-language-specific encoding techniques.
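
As a concrete illustration of the region-of-interest idea, the sketch below builds a per-macroblock quantizer-offset map that spends more bits on the signer's face and fewer on the background. It is only a minimal sketch, not the encoder used in the studies: the face detector is OpenCV's stock Haar cascade, and the offset values (face_bonus, background_penalty) are illustrative assumptions.

```python
# Sketch of region-of-interest bit allocation for sign language video.
# Assumption: offsets of -6 (face) and +4 (background) are illustrative,
# not values from the paper.
import math
import cv2
import numpy as np

MB = 16  # H.264 macroblock size in pixels


def face_roi_qp_offsets(frame_bgr, face_bonus=-6, background_penalty=4):
    """Return one QP offset per 16x16 macroblock.

    Negative offsets ask the encoder for higher quality in that block;
    positive offsets give quality back so the total bit budget holds.
    """
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    mb_rows = frame_bgr.shape[0] // MB
    mb_cols = frame_bgr.shape[1] // MB
    # Start by penalizing everything, then boost the face region.
    offsets = np.full((mb_rows, mb_cols), background_penalty, dtype=np.int8)

    for (x, y, w, h) in faces:
        # Boost every macroblock that overlaps the detected face rectangle.
        offsets[y // MB : min(mb_rows, math.ceil((y + h) / MB)),
                x // MB : min(mb_cols, math.ceil((x + w) / MB))] = face_bonus
    return offsets
```

A map like this could drive any encoder that supports per-macroblock rate control; x264, for instance, accepts per-macroblock QP offsets (x264_picture_t.prop.quant_offsets). The frame-rate tradeoff follows from simple arithmetic: at a fixed bit rate B, each frame receives roughly B/fps bits, so encoding 10 frames per second instead of 15 leaves about 50% more bits per frame.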
