Why Discourse Structures in Medical Reports Matter for the Validity of Automatically Generated Text Knowledge Bases

The automatic analysis of medical full texts currently suffers from neglecting text coherence phenomena, such as reference relations between discourse units. This neglect has adverse effects on the descriptive adequacy of medical knowledge bases that are automatically generated from texts. The resulting representation bias shows up as artificially fragmented, incomplete, and invalid knowledge structures. We discuss three types of textual phenomena (pronominal anaphora, nominal anaphora, and textual ellipsis) and outline basic methodologies for dealing with them.
