Continuous Speech Recognition for Clinicians

The authors have identified a number of issues that are important in managing accuracy and usability. First, for efficient recognition, users must start with a dictionary containing the phonetic spellings of all words they anticipate using. The authors dictated 50 discharge summaries using one inexpensive internal medicine dictionary ($30) and found that they needed to add 400 terms to reach a recognition rate of 98 percent. However, if they used either of two more expensive and extensive commercial medical vocabularies ($349 and $695), they did not need to add terms to reach a 98 percent recognition rate. Second, users must speak clearly and continuously, distinctly pronouncing all syllables. Users must also correct errors as they occur, because accuracy improves with error correction by at least 5 percent over two weeks. Users may find it difficult to train the system to recognize certain terms, regardless of the amount of training, and appropriate substitutions must be created. For example, the authors had to substitute “twice a day” for “bid” when using the less expensive dictionary, but not when using the other two dictionaries. From trials they conducted in settings ranging from an emergency room to hospital wards and clinicians’ offices, they learned that ambient noise has minimal effect. Finally, they found that a minimal “usable” hardware configuration (one that keeps up with dictation) comprises a 300-MHz Pentium processor with 128 MB of RAM and a “speech quality” sound card (e.g., SoundBlaster, $99). Anything less powerful will result in the system lagging behind the speaking rate. The authors obtained 97 percent accuracy with just 30 minutes of training when using the latest edition of one of the speech recognition systems supplemented by a commercial medical dictionary.
This technology has advanced considerably in recent years and is now a serious contender to replace some or all of the increasingly expensive alternative methods of dictation with human transcription.

JAMIA. 1999;6:195–204.

Affiliation of the authors: Indiana University, Regenstrief Institute for Health Care, Indianapolis, Indiana.

This work was supported in part by contracts N01-LM43510 and N01-6-3546 from the National Library of Medicine.

Correspondence and reprints: Atif Zafar, MD, Regenstrief Institute for Health Care, 1001 West 10th Street, RHC 5th Floor, Indianapolis, IN 46202-2859. e-mail: <zafaroa@regenstrief.iupui.edu>.

Received for publication: 10/12/98; accepted for publication: 1/11/99.

Medical informaticians have struggled with capturing physician-generated clinical data for a quarter century. A variety of approaches, ranging from direct typing to menu- and macro-based clinical note generation, have been studied and are being sold commercially (Table 1 provides a sampling of such vendors). Most physicians prefer dictation because it is simple, familiar, and fast. Furthermore, transcribed notes are suitable for entry into a computerized medical record. However, manual transcription incurs delays of hours or days and is expensive.
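The recognition rates quoted above (97 and 98 percent) are word-level accuracy figures. This section does not state how the authors scored their transcripts, so the following is only a minimal sketch of one common convention: accuracy computed as one minus the word error rate, where errors are counted by word-level edit distance between the dictated text and the recognized text. The function names and the sample sentences are hypothetical and are not taken from the paper.

```python
def word_edit_distance(reference, hypothesis):
    """Levenshtein distance between two lists of word tokens."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from the output
                curr[j - 1] + 1,     # extra word inserted by the recognizer
                prev[j - 1] + cost,  # match, or one word substituted
            ))
        prev = curr
    return prev[-1]


def word_accuracy(reference_text, recognized_text):
    """Return 1 - (word errors / reference length).

    Assumption: case-insensitive, whitespace-delimited tokens. The value
    can drop below zero if the recognizer inserts many extra words.
    """
    reference = reference_text.lower().split()
    hypothesis = recognized_text.lower().split()
    if not reference:
        raise ValueError("reference transcript is empty")
    errors = word_edit_distance(reference, hypothesis)
    return 1.0 - errors / len(reference)


if __name__ == "__main__":
    dictated = "the patient takes metoprolol twice a day"
    recognized = "the patient takes metoprolol twice today"
    print(f"word accuracy: {word_accuracy(dictated, recognized):.3f}")
```

Under this convention, a 98 percent rate corresponds to roughly two word errors per 100 dictated words, which is why a single unrecognized term that recurs often (such as “bid”) can noticeably lower the score.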
