Combining feature selection and classifier ensemble using a multiobjective simulated annealing approach: application to named entity recognition

In this paper, we propose a two-stage multiobjective simulated annealing (MOSA)-based technique for named entity recognition (NER). First, MOSA is used for feature selection under two statistical classifiers, viz. conditional random field (CRF) and support vector machine (SVM). Each solution on the final Pareto optimal front provides a different classifier. These classifiers are then combined using a new classifier ensemble technique that is also based on MOSA. Several versions of the objective functions are explored. We hypothesize that the reliability of each classifier's predictions differs across the output classes. Thus, in an ensemble system, it is necessary to find the appropriate voting weight for each output class in each classifier. We propose a MOSA-based technique to determine these voting weights automatically. The proposed two-stage technique is evaluated for NER in Bengali, a resource-poor language, as well as in English. The best evaluation results are recall, precision and F-measure values of 93.95%, 95.15% and 94.55%, respectively, for Bengali and 89.01%, 89.35% and 89.18%, respectively, for English. Experiments also suggest that the classifier ensemble identified by the proposed multiobjective optimization (MOO)-based approach, when it optimizes the F-measure values of named entity (NE) boundary detection, outperforms all the individual classifiers and four conventional baseline models.
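The per-class weighted vote described above can be illustrated with a minimal Python sketch. It covers only the combination step: the function name `weighted_vote`, the classifier names, and the weight values are hypothetical, and the MOSA search that the paper uses to find the weights is not implemented here.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine one token's predictions from several classifiers.

    predictions: dict mapping classifier name -> predicted class label
    weights:     dict mapping (classifier name, class label) -> vote weight
                 (hypothetical values; in the paper these are found by MOSA)
    """
    scores = defaultdict(float)
    for clf, label in predictions.items():
        # A classifier's vote counts with the weight assigned to the
        # specific class it predicted; missing pairs contribute nothing.
        scores[label] += weights.get((clf, label), 0.0)
    # Highest total weight wins; ties are broken alphabetically by label.
    return max(sorted(scores), key=lambda label: scores[label])


# Hypothetical example: two classifiers say LOC, one says PER.
predictions = {"crf_1": "PER", "crf_2": "LOC", "svm_1": "LOC"}
weights = {
    ("crf_1", "PER"): 0.9,  # crf_1 is assumed reliable on PER
    ("crf_2", "LOC"): 0.3,  # crf_2 is assumed weak on LOC
    ("svm_1", "LOC"): 0.4,  # svm_1 is assumed weak on LOC
}
result = weighted_vote(predictions, weights)
```

With these assumed weights, the single PER vote (0.9) outweighs the two LOC votes (0.3 + 0.4), so the result differs from a simple majority vote. This is the behaviour the hypothesis about class-dependent reliability is meant to capture.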
