Adaptive information extraction

The growing availability of online textual sources and the number of potential applications of knowledge acquisition from textual data have led to an increase in Information Extraction (IE) research. Examples of these applications include the generation of databases from documents and the acquisition of knowledge useful for emerging technologies such as question answering, information integration, and others related to text mining. However, one of the main drawbacks of applying IE is its intrinsic domain dependence. To reduce the high cost of manually adapting IE applications to new domains, the research community has experimented with different Machine Learning (ML) techniques. This survey describes and compares the main approaches to IE and the different ML techniques used to achieve Adaptive IE technology.
