A Theory-Refinement Approach to Information Extraction

We investigate applying theory refinement to the task of extracting information from text. In theory refinement, partial domain knowledge (which may be incorrect) is given to a supervised learner. The provided knowledge guides the learner in its task, but the learner can refine or even discard this knowledge during training. Our supervised learner is a "knowledge-based" neural network that initially contains "compiled" prior knowledge about a particular information extraction (IE) task. The prior knowledge needs to specify the extraction slots for the specific IE task. Our approach uses generate-and-test to address the IE task. In the generation step, we produce candidate extractions by intelligently searching the space of possible extractions. In the test step, we use the trained network to judge each candidate and output those that exceed a system-selected threshold. Experiments on the CMU seminar-announcements and the Yeast subcellular-localization domains demonstrate our approach's value.
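The generate-and-test loop described above can be illustrated with a minimal sketch. It is not the authors' system: `generate_candidates` naively enumerates short token spans where the paper searches the space intelligently, and `toy_score` is a hypothetical hand-written scorer standing in for the trained knowledge-based network. The slot name `start_time`, the span limit, and the threshold value are likewise assumptions chosen for illustration.

```python
from typing import Callable, Iterator, List, Tuple


def generate_candidates(tokens: List[str], max_len: int = 3) -> Iterator[str]:
    """Generation step (naive stand-in): yield every contiguous span of 1..max_len tokens."""
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            yield " ".join(tokens[start:end])


def toy_score(slot: str, text: str) -> float:
    """Hypothetical scorer standing in for the trained network's output."""
    words = text.split()
    looks_like_time = (
        len(words) == 2 and words[0][0].isdigit() and words[1] in {"am", "pm"}
    )
    return 0.9 if slot == "start_time" and looks_like_time else 0.1


def extract(
    tokens: List[str],
    slot: str,
    score: Callable[[str, str], float],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Test step: keep only the candidates whose score exceeds the threshold."""
    kept = []
    for text in generate_candidates(tokens):
        s = score(slot, text)
        if s > threshold:
            kept.append((text, s))
    return kept


tokens = "Seminar at 3:30 pm in Room 101".split()
results = extract(tokens, "start_time", toy_score, threshold=0.5)
```

In this sketch the threshold is passed in by hand; in the approach described above it is selected by the system.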
