Crowdsourced Continuous Improvement of Medical Speech Recognition

We describe a method for continuously improving the accuracy of a large-scale medical automatic speech recognizer (ASR) through a multi-step cycle involving several groups of workers. The paper addresses the unique challenges of the medical domain and discusses how automatically created and crowdsourced input data are combined to refine the ASR language models. The improvement cycle decreased the original system’s word error rate from 34.1% to 10.4%, which approaches the accuracy of human transcribers trained in medical transcription.
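For readers unfamiliar with the metric, the following is a minimal sketch of how word error rate is conventionally computed: the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The function names and the sample sentences are illustrative assumptions; the abstract does not describe the scoring tool actually used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: whitespace tokenization, no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which an error rate dropped, relative to its starting value."""
    return (before - after) / before


# Hypothetical example: "acute" misrecognized as "a cute"
# (one substitution plus one insertion against five reference words).
wer = word_error_rate(
    "the patient has acute bronchitis",
    "the patient has a cute bronchitis",
)

# Relative reduction implied by the figures reported in the abstract.
reduction = relative_reduction(34.1, 10.4)
```

Under this definition, the reported drop from 34.1% to 10.4% corresponds to a relative error reduction of roughly 69.5%.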
