The CSLU toolkit: rapid prototyping of spoken language systems

Research and development of spoken language systems is currently limited to relatively few academic and industrial laboratories. This is because building such systems requires multidisciplinary expertise, sophisticated development tools, specialized language resources, substantial computer resources and advanced technologies such as speech recognition and text-to-speech synthesis. At the Center for Spoken Language Understanding (CSLU), our mission is to make spoken language systems commonplace. To do so requires that the technology become less exclusive, more affordable and more accessible. An important step towards satisfying this goal is to place the development of spoken language systems in the hands of real domain experts rather than limit it to technical specialists.

To address this problem, we have developed the CSLU Toolkit, an integrated software environment for research and development of telephone-based spoken language systems (Sutton et al., 1996; Schalkwyk et al., 1997). It is designed to support a wide range of research and development activities, including data capture and analysis, corpus development, multilingual recognition and understanding, dialogue design, speech synthesis, speaker recognition and language recognition, and systems evaluation, among others. In addition, the Toolkit provides an excellent environment for learning about spoken language technology, providing opportunities for hands-on learning, exploration and experimentation. It has been used as a basis for several short courses in which students have produced a wide range of interesting spoken language applications, such as voice mail, airline reservation and browsing the World Wide Web by voice (Colton et al., 1996; Sutton et al., 1997).

Permission to make digital/hard copies of all or part of this material for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication and its date appear, and notice is given that copyright is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires specific permission and/or fee. UIST 97 Banff, Alberta, Canada. Copyright 1997 ACM 0-89791-881-9/97/10...$3.50

A key module of the Toolkit is a graphical application-creation environment called the CSLU Rapid Prototyper (CSLUrp). This integrates state-of-the-art speaker-independent and vocabulary-independent technology into an easy-to-use graphical interface. It enables spoken language applications to be developed and tested quickly and easily. Figure 1 shows a prototype application being developed using CSLUrp. The current version of CSLUrp allows for the rapid development of structured dialogues. It is designed to require minimal technical expertise on the author's part. It provides an intuitive window-like setting, in which applications are built by placing objects onto a canvas (e.g., a telephone-answering object, a speech recognition object, etc.) and connecting them with simple clicks of the mouse. Specifying words or phrases to be recognized by the system is a matter of simply typing them in. Similarly, specifying what the system will speak is a matter of typing or recording it. Once an application is complete, it can be run at the press of a button and interacted with either over the telephone or in a desktop setting via microphone and speaker. The capability to alternate between designing and testing an application allows for incremental development and iterative refinement of systems. CSLUrp provides non-expert and even novice users with the ability to create spoken language systems for themselves. As they become more experienced and familiar with the basic capabilities, they can move beyond the scope of CSLUrp and begin to learn about and take advantage of other modules of the CSLU Toolkit.
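The canvas-and-connections model described above can be pictured as a small graph of dialogue states, each holding a prompt and the typed phrases it listens for. The following is a minimal illustrative sketch in Python, not CSLUrp's actual interface: the names `DialogueState` and `run_dialogue` are hypothetical, and typed text with exact string matching stands in for speech recognition.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueState:
    """Hypothetical stand-in for one object placed on the canvas."""
    prompt: str  # what the system speaks in this state
    # Maps each phrase the state listens for to the name of the next state.
    transitions: dict = field(default_factory=dict)


def run_dialogue(states, start, user_inputs):
    """Walk the dialogue graph and return the prompts that would be spoken.

    Typed strings replace recognized speech (an assumption of this sketch).
    """
    spoken = []
    current = start
    inputs = iter(user_inputs)
    while True:
        state = states[current]
        spoken.append(state.prompt)
        if not state.transitions:
            break  # terminal state: nothing left to recognize
        heard = next(inputs, None)
        if heard is None:
            break  # no more input, e.g. the caller hung up
        # Exact match against the typed vocabulary; an unrecognized
        # phrase leaves the dialogue in the same state, so it re-prompts.
        current = state.transitions.get(heard.strip().lower(), current)
    return spoken


# A toy application in the spirit of the course projects mentioned above.
states = {
    "greeting": DialogueState(
        "Welcome. Say 'mail' or 'flights'.",
        {"mail": "voice_mail", "flights": "reservations"},
    ),
    "voice_mail": DialogueState("You have no new messages. Goodbye."),
    "reservations": DialogueState(
        "Which city are you flying to?",
        {"portland": "confirm"},
    ),
    "confirm": DialogueState("Booking a flight to Portland. Goodbye."),
}

transcript = run_dialogue(states, "greeting", ["flights", "Portland"])
for line in transcript:
    print(line)
```

Because the whole application is plain data, changing a prompt or adding a phrase and rerunning it takes seconds, which mirrors the design-then-test cycle the text describes.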

[1] Pieter J. E. Vermeulen, et al. CSLUsh: an extendible research environment, 1997, EUROSPEECH.

[2] Ronald A. Cole, et al. Building 10,000 spoken dialogue systems, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[3] Ronald A. Cole, et al. Bringing spoken language systems to the classroom, 1997, EUROSPEECH.

[4] Ronald A. Cole, et al. A laboratory course for designing and testing spoken dialogue systems, 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.