Paper Information - SRI International, Speech Recognition Program, Menlo Park, CA

Objectives: SRI's Speech Research Program focuses on developing useful speech-based systems and components for enhanced and efficient communication between human users and machines. The success of such development depends on accurate modeling of human speech and language, and the application of these models in system designs. SRI's Speech Research Program, therefore, pursues both empirical and theoretic research to gain a more comprehensive understanding of the acoustic, phonological, prosodic, lexical and syntactic nature of speech and language. SRI works on system internals such as improved performance of component technologies, as well as system-level design such as appropriate architectures and human factors solutions. Our specific technical goal is the tight integration of speech recognition and natural language understanding to create real-time systems for interactive problem solving.

• SRI has developed the DECIPHER speaker-independent speech recognition system, a hidden Markov model (HMM)-based system that achieves state-of-the-art recognition performance through accurate modeling of phonetic and phonological detail. SRI has completed basic studies of the range and structure of variability in the pronunciation of English. In particular, SRI has shown significant differences between read speech and the spontaneous speech observed during interactive problem solving.
• SRI has designed a hardware architecture for real-time recognition of continuous speech for vocabularies as large as 20,000 words. SRI is currently implementing a subset of that architecture as a prototype accelerator with several special-purpose integrated circuits.

• Hardware: Complete the implementation of a prototype accelerator for real-time continuous speech recognition. The accelerator will use HMMs and finite state grammars to recognize a 3000-word vocabulary. Fabricate two of these accelerators for use in related sponsored research at other sites. Begin the design and fabrication of the next generation of this hardware.
• Integration of Speech Recognition and Natural Language Understanding: Develop a computationally efficient natural language parser that incrementally generates a state transition network that can be used in place of a finite state grammar in an HMM-based speech recognizer.
• Spoken Language System: Design and implement a spoken language interface for interactive problem solving in the domain of air travel planning.
• Speech Synthesis: Improve the potential efficiency and reliability of voice displays by the synthesis of very distinct talker identities and the synthesis of audibly different urgency levels in a meaning-to-speech generation system.
• Neural-Net-Based Speech Recognition: Compare the relative effectiveness of neural net, HMM-based, and hybrid approaches for specific speech recognition processes such as feature extraction and …

Jared Bernstein | Hy Murveit