Research Paper: The Role of Domain Knowledge in Automating Medical Text Report Classification

OBJECTIVE To analyze the effect of expert knowledge on the inductive learning process in creating classifiers for medical text reports.

DESIGN The authors converted medical text reports to a structured form through natural language processing. They then inductively created classifiers for medical text reports using varying degrees and types of expert knowledge and different inductive learning algorithms. The authors measured performance of the different classifiers as well as the costs to induce classifiers and acquire expert knowledge.

MEASUREMENTS The measurements used were classifier performance, training-set size efficiency, and classifier creation cost.

RESULTS Expert knowledge was shown to be the most significant factor affecting inductive learning performance, outweighing differences in learning algorithms. The use of expert knowledge can affect comparisons between learning algorithms. This expert knowledge may be obtained and represented separately as knowledge about the clinical task or about the data representation used. The benefit of the expert knowledge is more than that of inductive learning itself, with less cost to obtain.

CONCLUSION For medical text report classification, expert knowledge acquisition is more significant to performance and more cost-effective to obtain than knowledge discovery. Building classifiers should therefore focus more on acquiring knowledge from experts than trying to learn this knowledge inductively.
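As an illustration only, the design described above can be sketched as a toy experiment: reports are assumed to be already reduced to sets of extracted findings, and the same inductive learner is trained once on every observed finding and once on a subset that an expert marks as relevant to the clinical task. The learner here is a small Bernoulli naive Bayes classifier chosen for brevity, not necessarily one of the algorithms the authors used, and all finding names, labels, and function names (`train_naive_bayes`, `predict`) are hypothetical.

```python
import math

def train_naive_bayes(examples, features):
    """Count, per class, how often each feature occurs.

    examples: list of (set_of_findings, label) pairs with boolean labels.
    features: the findings the learner is allowed to use.
    """
    counts = {True: 0, False: 0}
    present = {True: {f: 0 for f in features},
               False: {f: 0 for f in features}}
    for findings, label in examples:
        counts[label] += 1
        for f in features:
            if f in findings:
                present[label][f] += 1
    return counts, present, list(features)

def predict(model, findings):
    """Return True if the positive class has the higher log-probability."""
    counts, present, features = model
    total = counts[True] + counts[False]
    scores = {}
    for label in (True, False):
        # Laplace-smoothed class prior
        log_p = math.log((counts[label] + 1) / (total + 2))
        for f in features:
            # Laplace-smoothed probability that feature f is present
            p = (present[label][f] + 1) / (counts[label] + 2)
            log_p += math.log(p if f in findings else 1 - p)
        scores[label] = log_p
    return scores[True] > scores[False]

# Hypothetical structured reports: findings extracted by an NLP step,
# labeled True when the report supports the condition of interest.
training = [
    ({"infiltrate", "fever"}, True),
    ({"infiltrate", "catheter"}, True),
    ({"catheter"}, False),
    ({"cardiomegaly", "fever"}, False),
]

# No expert knowledge: the learner sees every finding in the training set.
all_features = set().union(*(findings for findings, _ in training))
# Expert knowledge about the clinical task (assumed): only this finding matters.
expert_features = {"infiltrate"}

baseline_model = train_naive_bayes(training, all_features)
expert_model = train_naive_bayes(training, expert_features)
```

Comparing `predict(baseline_model, report)` with `predict(expert_model, report)` on held-out reports, while varying the number of training examples, would correspond loosely to the classifier performance and training-set size efficiency measurements named in the abstract.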
