INTERACTIVE SEMANTIC ANALYSIS OF TECHNICAL TEXTS

Sentence syntax is the basis for organizing semantic relations in TANKA, a project that aims to acquire knowledge from technical text. Other hallmarks include an absence of precoded domain‐specific knowledge; significant use of public‐domain generic linguistic information sources; involvement of the user as a judge and source of expertise; and learning from the meaning representations produced during processing. These elements shape the realization of the TANKA project: implementing a trainable text processing system to propose correct semantic interpretations to the user. A three‐level model of sentence semantics, including a comprehensive Case system, provides the framework for TANKA's representations. Text is first processed by the DIPETT parser, which can handle a wide variety of unedited sentences. The semantic analysis module HAIKU then semi‐automatically extracts semantic patterns from the parse trees and composes them into domain knowledge representations. HAIKU's dictionaries and main algorithm are described with the aid of examples and traces of user interaction. Encouraging experimental results are described and evaluated.
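The propose-and-confirm interaction summarized above can be illustrated with a small sketch. It is not HAIKU's algorithm or data model: the class `PatternDictionary`, the function `analyse`, the reduction of a parsed clause to a `(verb, marker)` pair, and the Case label `Instrument` are all hypothetical simplifications chosen for illustration. The sketch shows only the general idea of ranking candidate interpretations from previously confirmed patterns, letting the user judge, and storing the user's decision so that later proposals improve.

```python
from collections import Counter, defaultdict


class PatternDictionary:
    """Toy store of user-confirmed (verb, marker) -> Case assignments (hypothetical)."""

    def __init__(self):
        self.by_pattern = defaultdict(Counter)  # (verb, marker) -> Case counts
        self.by_marker = defaultdict(Counter)   # marker alone -> Case counts

    def propose(self, verb, marker):
        """Return candidate Cases: exact-pattern evidence first, then marker-only evidence."""
        ranked = [case for case, _ in self.by_pattern[(verb, marker)].most_common()]
        for case, _ in self.by_marker[marker].most_common():
            if case not in ranked:
                ranked.append(case)
        return ranked

    def confirm(self, verb, marker, case):
        """Record the Case the user accepted so that later proposals can use it."""
        self.by_pattern[(verb, marker)][case] += 1
        self.by_marker[marker][case] += 1


def analyse(clauses, ask_user, patterns):
    """Propose Cases for each (verb, marker) pair and let the user make the final call."""
    results = []
    for verb, marker in clauses:
        candidates = patterns.propose(verb, marker)
        case = ask_user(verb, marker, candidates)  # the user is the judge
        patterns.confirm(verb, marker, case)       # learn from the accepted reading
        results.append((verb, marker, case))
    return results
```

In this sketch `ask_user` is any callable that receives the ranked candidates and returns the accepted Case. With an empty dictionary the candidate list is empty and the user must supply a label; once a marker has been confirmed for one verb, the same Case is offered for other verbs that use that marker.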
