While many organizations provide websites in multiple languages, few provide a sign-language version for deaf users, many of whom have lower written-language literacy. Rather than providing difficult-to-update videos of human signers, a more practical solution would be for the organization to specify a script (the sequence of words) from which a sign-language animation is generated. The challenge is that the system must select accurate speed and timing for the signs. In this work, focused on American Sign Language (ASL), motion-capture data recorded from human signers is used to train machine-learning models that calculate realistic timing for ASL animation movement, with an initial focus on inserting prosodic breaks (pauses), adjusting the durations of these pauses, and adjusting the differential signing rate of ASL animations based on sentence syntax and other features. The methodology includes processing and cleaning data from an ASL corpus of motion-capture recordings, selecting features, and building machine-learning models to predict where to insert pauses, the duration of those pauses, and the signing speed. The resulting models were evaluated with a cross-validation approach, training and testing multiple models on different partitions of the dataset to compare learning algorithms and subsets of features. In addition, a user-based evaluation was conducted in which native ASL signers evaluated animations generated from these models. This paper summarizes the motivation for this work, the proposed solution, and its potential contributions, describing both completed work and plans for future research.
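To make the model-comparison step concrete, the following is a minimal sketch (not the authors' code) of cross-validating candidate learning algorithms for the pause-insertion task. It assumes scikit-learn and uses synthetic stand-ins for the corpus-derived features and labels; the feature semantics in the comments are hypothetical illustrations of the kinds of syntactic and timing features described above.

```python
# Hedged sketch: comparing learning algorithms via cross-validation for
# predicting whether a prosodic break (pause) should be inserted at a
# sign boundary. Data here is synthetic; a real pipeline would extract
# features from the ASL motion-capture corpus.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical per-boundary feature vectors, e.g.:
# [sentence length in signs, distance to nearest syntactic clause
#  boundary, duration of the preceding sign].
X = rng.normal(size=(500, 3))
# Hypothetical labels: 1 = a pause is inserted at this boundary.
y = (X[:, 1] > 0.5).astype(int)

# Train and test each candidate model on multiple partitions of the
# dataset (5-fold cross-validation) to compare learning algorithms.
for name, model in [
    ("logistic regression", LogisticRegression(max_iter=1000)),
    ("random forest", RandomForestClassifier(n_estimators=100, random_state=0)),
]:
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    print(f"{name}: mean F1 = {scores.mean():.3f} (+/- {scores.std():.3f})")
```

The same pattern extends to the other two prediction tasks by swapping in regression models (e.g., a random-forest regressor scored with mean absolute error) for pause duration and signing speed, and by repeating the comparison over different feature subsets.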