Multiobjective Optimization Approach for Named Entity Recognition

In this paper, we propose a multiobjective optimization (MOO) based technique for Named Entity Recognition (NER) that determines the appropriate voting weight for each class in each classifier. Our underlying assumption is that the reliability of a classifier's predictions differs among the various named entity (NE) classes. It is therefore necessary to quantify how much weight each classifier's vote should carry for each class. We use Maximum Entropy (ME) as the base classifier and generate a number of classifiers by varying the feature representation. The proposed algorithm is evaluated on Bengali, a resource-constrained language, and yields overall recall, precision and F-measure values of 79.98%, 82.24% and 81.10%, respectively. Experiments also show that the classifier ensemble identified by the proposed multiobjective technique outperforms all the individual classifiers, three different conventional baseline ensembles and an existing single objective optimization based approach.
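
To make the idea of per-class, per-classifier voting weights concrete, the following is a minimal sketch of how such weights could be used to combine predictions once they are known. It is not the paper's implementation: the function name `weighted_vote`, the label set and the weight values are all hypothetical, and the weights are hard-coded here rather than found by a multiobjective search.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine one token's predictions from several classifiers.

    predictions: list of labels, predictions[i] is classifier i's output.
    weights: list of dicts, weights[i][label] is the (assumed) voting
             weight of classifier i when it predicts `label`.
    Returns the label with the largest accumulated weight.
    """
    scores = defaultdict(float)
    for clf_idx, label in enumerate(predictions):
        # A classifier only votes for the label it predicted, with a
        # weight that depends on both the classifier and the class.
        scores[label] += weights[clf_idx].get(label, 0.0)
    return max(scores, key=scores.get)


# Illustrative weights only: classifier 0 is trusted on PER,
# classifiers 1 and 2 are trusted more on LOC.
weights = [
    {"PER": 0.9, "LOC": 0.4, "O": 0.5},
    {"PER": 0.3, "LOC": 0.8, "O": 0.5},
    {"PER": 0.2, "LOC": 0.7, "O": 0.6},
]

# Two classifiers agree on LOC, so LOC wins with 0.8 + 0.7 = 1.5.
agreed = weighted_vote(["PER", "LOC", "LOC"], weights)

# All three disagree; plain majority voting would tie, but the
# class-specific weights favour classifier 0's PER prediction (0.9).
split = weighted_vote(["PER", "LOC", "O"], weights)
```

In this sketch a three-way disagreement is resolved by the class-specific weights, a case that unweighted majority voting cannot settle. In the proposed approach the weight table itself is what the optimization would search for.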
