Ensemble Learning for Named Entity Recognition

A considerable portion of the information on the Web is still only available in unstructured form. Implementing the vision of the Semantic Web thus requires transforming this unstructured data into structured data. One key step in this process is the recognition of named entities. Previous work suggests that ensemble learning can be used to improve the performance of named entity recognition tools. However, no comparison of the performance of existing supervised machine learning approaches on this task has been presented so far. We address this research gap by presenting a thorough evaluation of named entity recognition based on ensemble learning. To this end, we combine four different state-of-the-art approaches by using 15 different algorithms for ensemble learning and evaluate their performance on five different datasets. Our results suggest that ensemble learning can reduce the error rate of state-of-the-art named entity recognition systems by 40%, thereby leading to an F-score of over 95% in our best run.
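To make the idea of combining several recognizers concrete, the following is a minimal sketch of token-level majority voting, the most basic ensemble scheme. It is an illustration only: the abstract does not list the 15 algorithms evaluated, so this is not presented as one of them. The function name `majority_vote`, the label set (`PER`, `LOC`, `ORG`, `O`), and the four tagger outputs are hypothetical.

```python
from collections import Counter


def majority_vote(predictions, default="O"):
    """Combine per-token labels from several taggers by simple majority vote.

    predictions: list of label sequences, one per tagger, all of equal length.
    default: label used when the top vote is tied (assumption: fall back to
             the "outside" label "O").
    """
    n_tokens = len(predictions[0])
    if any(len(p) != n_tokens for p in predictions):
        raise ValueError("all taggers must label the same number of tokens")

    combined = []
    # zip(*predictions) yields, for each token, the labels from all taggers.
    for labels in zip(*predictions):
        counts = Counter(labels)
        top_count = max(counts.values())
        winners = [label for label, c in counts.items() if c == top_count]
        # Accept the label only if it is the unique winner; otherwise use default.
        combined.append(winners[0] if len(winners) == 1 else default)
    return combined


# Hypothetical outputs of four taggers for: "Angela Merkel visited Paris"
tagger_a = ["PER", "PER", "O", "LOC"]
tagger_b = ["PER", "O", "O", "LOC"]
tagger_c = ["PER", "PER", "O", "ORG"]
tagger_d = ["O", "PER", "O", "LOC"]

ensemble = majority_vote([tagger_a, tagger_b, tagger_c, tagger_d])
print(ensemble)
```

In this toy input each tagger makes at most one mistake, and the vote recovers the intended labeling because the mistakes fall on different tokens. Supervised ensemble methods go further by learning how much to trust each tagger, rather than weighting all votes equally.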
