Crawling and Classification Strategies for Generating a Multi-Language Corpus of Sign Language Video

Although considerable sign language content is available online, it can be hard to locate content in a specific sign language on a particular topic. The Sign Language Digital Library (SLaDL) aims to improve access by generating a multi-language corpus of sign language video. SLaDL first crawls to collect potential sign language content, then applies multimodal sign language detection and identification classifiers to winnow the collected videos to those believed to be in a particular sign language. Here we compare the quantity and variety of sign language videos located via breadth-first, depth-first, and focused crawling strategies. We then examine the accuracy of different approaches to combining textual metadata and video features for the 3-way classification task of distinguishing videos in American Sign Language (ASL), videos in British Sign Language (BSL), and videos without sign language. Finally, because generating the video features used for classification is computationally expensive, we explore how classifier accuracy is affected by using a cascading classifier and by generating features from motion in sampled frames only.
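The three crawling strategies compared above differ mainly in how the next page is chosen from the frontier of discovered links. The following is a minimal sketch of that difference, not SLaDL's crawler: the toy link graph `GRAPH`, the relevance scores in `RELEVANCE`, and the function `crawl` are all hypothetical names introduced for illustration, and a real focused crawler would estimate relevance from page or video metadata rather than look it up in a table.

```python
import heapq
from collections import deque

# Hypothetical toy link graph: page -> pages it links to.
GRAPH = {"seed": ["a", "b"], "a": ["c"], "b": ["d"], "c": [], "d": []}

# Hypothetical relevance estimates in [0, 1]; higher means more promising.
RELEVANCE = {"seed": 1.0, "a": 0.5, "b": 0.9, "c": 0.2, "d": 0.3}


def crawl(graph, seed, strategy, score=None, limit=10):
    """Return the order in which pages are visited.

    strategy is "bfs" (FIFO frontier), "dfs" (LIFO frontier), or
    "focused" (priority frontier ordered by score, highest first).
    """
    visited = []
    seen = {seed}
    counter = 0  # tie-breaker so the heap never compares page names
    if strategy == "focused":
        frontier = [(-score(seed), counter, seed)]
    else:
        frontier = deque([seed])

    while frontier and len(visited) < limit:
        if strategy == "bfs":
            page = frontier.popleft()          # oldest discovered link first
        elif strategy == "dfs":
            page = frontier.pop()              # newest discovered link first
        else:
            page = heapq.heappop(frontier)[2]  # most relevant link first
        visited.append(page)

        for link in graph.get(page, []):
            if link in seen:
                continue
            seen.add(link)
            if strategy == "focused":
                counter += 1
                heapq.heappush(frontier, (-score(link), counter, link))
            else:
                frontier.append(link)
    return visited


if __name__ == "__main__":
    for name in ("bfs", "dfs", "focused"):
        print(name, crawl(GRAPH, "seed", name, score=RELEVANCE.get))
```

On this toy graph the three strategies visit the same pages in three different orders, which is why, under a fixed crawl budget (`limit`), they can collect different quantities and varieties of content.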
