Automatic Understanding of ATC Speech: Study of Prospectives and Field Experiments for Several Controller Positions

Although there has been considerable interest in recognizing and understanding air traffic control (ATC) speech, none of the published works has reported detailed results on field data. We have developed a system that identifies the language spoken and recognizes and understands sentences in both Spanish and English. We also present field results for several in-tower controller positions. To the best of our knowledge, this is the first time that field (not simulated) ATC speech has been captured, processed, and analyzed. The use of stochastic grammars accommodates the variations from standard phraseology that appear in field data. The robust understanding algorithm we developed achieves 95% concept accuracy on ATC text input. It also tolerates changes in the order in which concepts are presented and corrects errors introduced by the speech recognition engine: the percentage of fully correctly understood sentences exceeds the percentage of fully correctly recognized sentences by 17% absolute for English and 25% absolute for Spanish. We also analyze the errors caused by the spontaneity of the speech and compare them with those for read speech. For Spanish on the "clearances" task, word accuracy falls from 96% for read speech to 86% for field ATC data, confirming that field data are needed to estimate the performance of a system. Finally, we provide a literature review and a critical discussion of the possibilities of speech recognition and understanding technology applied to ATC speech.
