Nasal sound generation and pitch control for a real-time hand-to-speech system

When individuals with speech disabilities, such as dysarthria, try to communicate using speech, they often rely on speech synthesizers that require them to type word or sound symbols. This input method makes real-time operation difficult, and dysarthric users often fail to control the flow of conversation. In this study, we are developing a novel speech synthesizer in which speech is generated from hand motions rather than symbol input. In recent years, statistical voice conversion techniques based on space mapping between given parallel data sequences have been proposed. By applying these techniques, a hand space and a vowel space are mapped to each other, and a converter from hand motions to vowel transitions is developed. It has been reported that this approach can generate the five Japanese vowels effectively. In this paper, we discuss extending this system to consonant generation and pitch control. For the former, two methods are examined: waveform concatenation and space mapping for consonant sounds. For the latter, pitch is controlled using the posture of the arm measured by a magnetic sensor.
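
As a rough illustration of the space-mapping idea described above, the sketch below fits a joint Gaussian mixture model on time-aligned pairs of hand-motion and vowel spectral features taken from parallel data, and converts a new hand-feature vector to a spectral vector via the conditional mean. It also includes a simple linear-in-log-F0 mapping from arm pitch angle to fundamental frequency. The feature dimensions, the use of scikit-learn, and the F0 control law are assumptions made for illustration only and are not taken from the system described in this paper.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

HAND_DIM = 6   # assumed dimensionality of the hand-motion feature (e.g. finger angles)
SPEC_DIM = 24  # assumed dimensionality of the vowel spectral feature (e.g. mel-cepstrum)


def train_joint_gmm(hand_feats, spec_feats, n_components=8):
    """Fit a GMM on concatenated [hand, spectrum] vectors from time-aligned parallel data."""
    joint = np.hstack([hand_feats, spec_feats])
    gmm = GaussianMixture(n_components=n_components, covariance_type="full")
    gmm.fit(joint)
    return gmm


def hand_to_spectrum(gmm, hand_vec):
    """Map one hand-feature vector to a spectral vector via the GMM conditional mean."""
    x = np.asarray(hand_vec, dtype=float)
    d = HAND_DIM
    log_resp = []
    cond_means = []
    for k in range(gmm.n_components):
        mu = gmm.means_[k]
        cov = gmm.covariances_[k]
        mu_x, mu_y = mu[:d], mu[d:]
        cov_xx = cov[:d, :d]
        cov_yx = cov[d:, :d]
        inv_xx = np.linalg.inv(cov_xx)
        diff = x - mu_x
        # log of (mixture weight * Gaussian likelihood of the hand part), up to a constant
        log_p = (np.log(gmm.weights_[k])
                 - 0.5 * diff @ inv_xx @ diff
                 - 0.5 * np.log(np.linalg.det(cov_xx)))
        log_resp.append(log_p)
        # conditional mean of the spectral part given the hand part for component k
        cond_means.append(mu_y + cov_yx @ inv_xx @ diff)
    log_resp = np.array(log_resp)
    resp = np.exp(log_resp - log_resp.max())
    resp /= resp.sum()
    return np.sum(resp[:, None] * np.array(cond_means), axis=0)


def arm_pitch_to_f0(pitch_angle_deg, f0_min=80.0, f0_max=250.0):
    """Illustrative mapping from arm pitch angle (assumed -45..+45 deg, e.g. from a
    magnetic sensor) to F0 in Hz, linear in log-F0; the actual control law is not
    specified in the abstract."""
    t = np.clip((pitch_angle_deg + 45.0) / 90.0, 0.0, 1.0)
    return f0_min * (f0_max / f0_min) ** t
```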