Opportunities for Advanced Speech Processing in Military Computer-Based Systems

The author presents a study of military applications of advanced speech processing technology which includes three major elements: (1) a review and assessment of current efforts in military applications of speech technology, (2) identification of opportunities for future military applications of advanced speech technology, and (3) identification of problem areas where research in speech processing is needed to meet application requirements, and of current research thrusts which appear promising. Opportunities for advanced applications are identified through descriptions of several generic systems that would become possible with advances in speech technology and in system integration. These generic systems include an integrated multirate voice/data communications terminal, an interactive speech enhancement system, a voice-controlled pilot's associate system, advanced air traffic control training systems, a battle management command and control support system with a spoken natural language interface, and a spoken language translation system. In the applications discussed, various spectral analysis and digitization techniques are combined with artificial intelligence schemes to allow a machine to recognize continuous-speech sentences with vocabularies running to thousands of words.
