Evaluating a Dynamic Time Warping Based Scoring Algorithm for Facial Expressions in ASL Animations

Advancing the automatic synthesis of linguistically accurate and natural-looking American Sign Language (ASL) animations from an easy-to-update script would increase information accessibility for many people who are deaf by making it easier to add ASL content to websites and media. We are investigating the production of ASL grammatical facial expressions and head movements, coordinated with the manual signs, that are crucial for the interpretation of signed sentences. It would be useful for researchers to have an automatic scoring algorithm that could rate the similarity of two animation sequences of ASL facial movements (or of an animation sequence and a motion-capture recording of a human signer). We present a novel, sign-language-specific similarity scoring algorithm, based on Dynamic Time Warping (DTW), for facial expression performances, along with the results of a user study in which the predictions of this algorithm were compared to the judgments of ASL signers. We found that our algorithm had significant correlations with participants' comprehension scores for the animations and with the degree to which they reported noticing specific facial expressions.
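
To make the underlying technique concrete, the sketch below shows generic DTW alignment between two multidimensional sequences of facial-feature vectors. This is a minimal illustration of standard DTW only, assuming Euclidean frame-to-frame distance; it does not reproduce the paper's sign-language-specific scoring (its feature set, weighting, and normalization are not described here), and the sequence shapes and feature counts are hypothetical.

```python
# Minimal sketch of generic Dynamic Time Warping (DTW) between two
# sequences of facial-feature vectors. Illustrative only; the paper's
# sign-language-specific scoring algorithm is not reproduced here.
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Return the DTW alignment cost between sequences a (n x d) and b (m x d)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])   # local frame distance
            cost[i, j] = d + min(cost[i - 1, j],       # step: advance a only
                                 cost[i, j - 1],       # step: advance b only
                                 cost[i - 1, j - 1])   # step: advance both
    return cost[n, m]

# Hypothetical usage: two performances of the same facial expression,
# sampled to different lengths (e.g., animation vs. motion capture).
seq1 = np.random.rand(120, 6)  # 120 frames x 6 facial parameters (assumed)
seq2 = np.random.rand(150, 6)
print(dtw_distance(seq1, seq2))
```

Because DTW warps the time axis, the two performances need not share a frame rate or duration; a lower cost indicates more similar facial-movement trajectories.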
