On "deep" knowledge extraction from documents

SynDiKATe comprises a family of natural language understanding systems for automatically acquiring knowledge from real-world texts (e.g., information technology test reports, medical finding reports), and for transferring their content to formal representation structures that constitute a corresponding text knowledge base. We present a general system architecture that integrates the requirements of analyzing single sentences with those of analyzing referentially linked sentences that form cohesive texts. Properly accounting for text cohesion phenomena is a prerequisite for the soundness and validity of the generated text representation structures. It is equally crucial for any information system application that needs to rely on automatically generated text knowledge bases.
