Research Paper: Natural Language Processing and the Representation of Clinical Data

OBJECTIVE: Develop a representation of clinical observations and actions and a method of processing free-text patient documents to facilitate applications such as quality assurance.

DESIGN: The Linguistic String Project (LSP) system of New York University utilizes syntactic analysis, augmented by a sublanguage grammar and an information structure that are specific to the clinical narrative, to map free-text documents into a database for querying.

MEASUREMENTS: Information precision (I-P) and information recall (I-R) were measured for queries for the presence of 13 asthma health care quality assurance criteria in a database generated from 59 discharge letters.

RESULTS: I-P, using counts of major errors only, was 95.7% for the 28-letter training set and 98.6% for the 31-letter test set. I-R, using counts of major omissions only, was 93.9% for the training set and 92.5% for the test set.
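
The abstract does not spell out how I-P and I-R are computed. As a minimal sketch, assuming they follow the usual precision and recall ratios (correct items over correct plus major errors, and correct items over correct plus major omissions), the calculation might look like the following. The function names and the counts in the usage lines are hypothetical and are not taken from the study.

```python
def information_precision(correct: int, major_errors: int) -> float:
    """Assumed definition: share of retrieved items that are correct."""
    retrieved = correct + major_errors
    if retrieved == 0:
        raise ValueError("no retrieved items to score")
    return correct / retrieved


def information_recall(correct: int, major_omissions: int) -> float:
    """Assumed definition: share of expected items that were retrieved."""
    expected = correct + major_omissions
    if expected == 0:
        raise ValueError("no expected items to score")
    return correct / expected


# Hypothetical counts for illustration only (not the paper's data).
precision = information_precision(correct=90, major_errors=10)
recall = information_recall(correct=45, major_omissions=5)
print(f"I-P = {precision:.1%}, I-R = {recall:.1%}")
```

Counting only major errors and major omissions, as the abstract describes, means minor discrepancies would not lower either score under this reading.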
