From Text to Horn Clauses: Combining Linguistic Analysis and Machine Learning

The paper describes a system that extracts knowledge from technical English texts. Our basic assumption is that in technical texts syntax is a reliable indication of meaning. Consequently, semantic interpretation of the text starts from surface syntax. The linguistic component of the system uses a broad-coverage, domain-independent parser of English, as well as a user-assisted semantic interpreter that memorizes its experience. The resulting semantic structures are translated into Horn clauses, a representation suitable for Explanation-based Learning (EBL). An EBL engine performs symbol-level learning on representations of both the domain theory and the example provided by the linguistic part of the system. Our approach has been applied to the Canadian Individual Income Tax Guide and examples from it are used in the presentation.
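
To make the translation step concrete, the following is a minimal sketch of how a semantic structure might be turned into a Horn clause. The abstract does not specify the system's internal representation, so everything here is an assumption made for illustration: the case-frame dictionaries, the role names `agent` and `object`, and the helpers `frame_to_literal` and `rule_to_clause` are hypothetical, and the sample rule is invented rather than taken from the Tax Guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """A predicate applied to arguments, e.g. pay(X, tuition)."""
    predicate: str
    args: tuple

    def __str__(self):
        return f"{self.predicate}({', '.join(self.args)})"


@dataclass(frozen=True)
class HornClause:
    """One head literal and zero or more body literals."""
    head: Literal
    body: tuple = ()

    def __str__(self):
        if not self.body:
            return f"{self.head}."  # a fact
        return f"{self.head} :- {', '.join(str(b) for b in self.body)}."


def frame_to_literal(frame):
    """Map a hypothetical case frame to a literal.

    Assumed frame shape: {"verb": str, "roles": {role_name: filler}}.
    Roles are ordered alphabetically so argument positions stay stable.
    """
    roles = frame["roles"]
    args = tuple(roles[name] for name in sorted(roles))
    return Literal(frame["verb"], args)


def rule_to_clause(conclusion, conditions):
    """Build a Horn clause from a conclusion frame and condition frames."""
    return HornClause(
        head=frame_to_literal(conclusion),
        body=tuple(frame_to_literal(c) for c in conditions),
    )


# Invented example in the spirit of "if X pays tuition, X may claim a deduction".
conclusion = {"verb": "claim", "roles": {"agent": "X", "object": "deduction"}}
conditions = [{"verb": "pay", "roles": {"agent": "X", "object": "tuition"}}]
clause = rule_to_clause(conclusion, conditions)
print(clause)
```

The clause prints in Prolog-style syntax, the form in which a Horn-clause domain theory would typically be handed to an EBL engine; the actual system may order arguments and name predicates differently.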
