Paper information - A framework and toolkit for the construction of multimodal learning interfaces

Abstract: This dissertation contributes in three main areas: (1) a theory of multimodal interaction, (2) a software architecture and reusable application framework, and (3) rapid application prototyping by domain-specific instantiation of a common underlying architecture. The foundation of the application framework and the rapid prototyping tools is a model of multimodal interpretation based on semantic integration of information streams. This model supports most conceivable human communication modalities in the context of a broad class of applications, specifically those that support state manipulation via parameterized actions. The multimodal semantic model is also the basis for a flexible, domain-independent, incrementally trainable multimodal interpretation algorithm based on a connectionist network. The second major contribution is an application framework consisting of reusable components and a modular, distributed system architecture. Multimodal application developers can assemble the components in the framework into a new application, accepting default options when appropriate and providing application-specific customizations when needed. The third major contribution is a design process backed by a workbench of tools that permits the rapid prototyping of a multimodal application. This design process systematically constructs the customizations needed to interpret multimodal inputs in a given domain, allowing an application structure created in the proposed framework to be instantiated for that domain. The application framework and design process have been successfully applied to the construction of three multimodal systems in three different domains.

Alex Waibel | Minh Tue Vo
