A maximum entropy approach to information extraction from semi-structured and free text

In this paper, we present a classification-based approach to both single-slot and multi-slot information extraction (IE). For single-slot IE, we worked on the Seminar Announcements domain, where each document contains information on only one seminar. For multi-slot IE, we worked on the Management Succession domain, where we restrict ourselves to extracting information sentence by sentence, as in (Soderland 1999); each sentence can contain information on several management succession events. Using a classification approach based on a maximum entropy framework, our system achieves higher accuracy than the best previously published results in both domains.
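To make the classification view concrete, the following is a minimal sketch of a maximum entropy (multinomial logistic) classifier that assigns a slot label to each token. It is an illustration under stated assumptions, not the system described in the paper: the feature set, the label names (`stime`, `location`, `O`), the toy sentence, and the plain gradient-ascent training loop are all hypothetical choices made for this example. Maximum entropy models are often trained with other procedures, such as the generalized iterative scaling of reference [17].

```python
import math
from collections import defaultdict

def features(tokens, i):
    """Binary features for the token at position i (hypothetical feature set)."""
    word = tokens[i]
    prev = tokens[i - 1].lower() if i > 0 else "<s>"
    return [
        "word=" + word.lower(),
        "prev=" + prev,
        "is_cap=" + str(word[:1].isupper()),
        "has_digit=" + str(any(c.isdigit() for c in word)),
    ]

def predict_proba(weights, labels, feats):
    """p(label | feats) is proportional to exp(sum of active feature weights)."""
    scores = {y: sum(weights[(f, y)] for f in feats) for y in labels}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}

def train(examples, labels, epochs=300, lr=0.5):
    """Fit weights by batch gradient ascent on the conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, gold in examples:
            probs = predict_proba(weights, labels, feats)
            for f in feats:
                for y in labels:
                    # observed feature count minus expected feature count
                    grad[(f, y)] += (1.0 if y == gold else 0.0) - probs[y]
        for key, g in grad.items():
            weights[key] += lr * g / len(examples)
    return weights

# Toy single-slot example in the spirit of a seminar announcement (assumed data).
LABELS = ["O", "stime", "location"]
tokens = ["The", "seminar", "starts", "at", "3:30", "pm", "in", "Wean", "Hall"]
gold = ["O", "O", "O", "O", "stime", "stime", "O", "location", "location"]

examples = [(features(tokens, i), gold[i]) for i in range(len(tokens))]
weights = train(examples, LABELS)

predicted = []
for i in range(len(tokens)):
    probs = predict_proba(weights, LABELS, features(tokens, i))
    predicted.append(max(probs, key=probs.get))
```

Here each token is classified independently. A multi-slot setting such as Management Succession would additionally need a step that groups the classified fillers into events within a sentence, which this sketch does not attempt.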

[1]  Stephen Soderland, et al. Learning Information Extraction Rules for Semi-Structured and Free Text, 1999, Machine Learning.

[2]  Ralph Grishman, et al. Unsupervised Discovery of Scenario-Level Patterns for Information Extraction, 2000, ANLP.

[3]  Thorsten Joachims, et al. Text Categorization with Support Vector Machines: Learning with Many Relevant Features, 1998, ECML.

[4]  John D. Lafferty, et al. Inducing Features of Random Fields, 1995, IEEE Trans. Pattern Anal. Mach. Intell.

[5]  Ralph Grishman, et al. A Maximum Entropy Approach to Named Entity Recognition, 1999.

[6]  Andrew McCallum, et al. Maximum Entropy Markov Models for Information Extraction and Segmentation, 2000, ICML.

[7]  Andrew McCallum, et al. Information Extraction with HMMs and Shrinkage, 1999.

[8]  Andrew McCallum, et al. Information Extraction with HMM Structures Learned by Stochastic Optimization, 2000, AAAI/IAAI.

[9]  M. Califf, et al. Relational learning techniques for natural language information extraction, 1998.

[10]  Ricky K. Taira, et al. A statistical natural language processor for medical reports, 1999, AMIA.

[11]  David Fisher, et al. Description of the UMass system as used for MUC-6, 1995, MUC.

[12]  Dan Roth, et al. Relational Learning via Propositional Algorithms: An Information Extraction Case Study, 2001, IJCAI.

[13]  Dayne Freitag, et al. Boosted Wrapper Induction, 2000, AAAI/IAAI.

[14]  Ralph Grishman, et al. Adaptive Information Extraction and Sublanguage Analysis, 2001.

[15]  Dayne Freitag, et al. Information Extraction from HTML: Application of a General Machine Learning Approach, 1998, AAAI/IAAI.

[16]  Ellen Riloff, et al. Learning Dictionaries for Information Extraction by Multi-Level Bootstrapping, 1999, AAAI/IAAI.

[17]  J. Darroch, et al. Generalized Iterative Scaling for Log-Linear Models, 1972.

[18]  Stephen G. Soderland. Building a Machine Learning Based Text Understanding System, 2001.

[19]  Hwee Tou Ng, et al. Feature selection, perceptron learning, and a usability case study for text categorization, 1997, SIGIR '97.

[20]  Michael Collins, et al. Semantic Tagging using a Probabilistic Context Free Grammar, 1998, VLC@COLING/ACL.

[21]  Fabio Ciravegna, et al. Adaptive Information Extraction from Text by Rule Induction and Generalisation, 2001, IJCAI.