Parsing named entity as syntactic structure

Named entity recognition (NER) plays an important role in many natural language processing applications. This paper presents a novel approach to Chinese NER that differs from most previous approaches in three main respects. First, while previous work is good at modeling features between observation elements, our model incorporates syntactic structure as higher-level information. This is crucial for recognizing long named entities, which are one of the main difficulties of NER. Second, NER and syntactic analysis have so far been modeled separately in natural language processing; we integrate them in a unified framework, which allows the information from each type of annotation to improve performance on the other and produces consistent output. Third, few studies have been reported on the recognition of nested named entities in Chinese, and this paper presents a structured prediction model for Chinese nested named entity recognition. Our approach is implemented through a joint representation of syntactic and named entity structures. We provide empirical evidence that a parsing model can use syntactic constraints to recognize named entities and can exploit the composition patterns of named entities. Experimental results demonstrate mutual benefits for both tasks, and the model additionally outputs the syntactic structure of named entities.
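To make the idea of a joint representation more concrete, the following is a minimal sketch, not the paper's actual annotation scheme or model. It assumes that entity types are attached to constituent labels as a suffix (for example `NP-ORG`), that the tree is given in bracketed form, and that the type inventory `NE_TYPES` is a small hypothetical set. Under those assumptions, nested named entities can be read directly off a single parse tree.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical entity-type inventory; the paper's actual label set is not given here.
NE_TYPES = {"PER", "ORG", "GPE"}


@dataclass
class Node:
    label: str                                   # e.g. "NP-ORG" or "NR"
    children: List["Node"] = field(default_factory=list)
    word: str = ""                               # set only on preterminal nodes


def parse_bracketed(text: str) -> Node:
    """Parse a bracketed tree such as "(NP-GPE (NR 北京))" into Node objects."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read() -> Node:
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        node = Node(label=tokens[pos])
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.children.append(read())
            else:
                node.word = tokens[pos]
                pos += 1
        pos += 1                                 # consume ")"
        return node

    return read()


def leaves(node: Node) -> List[str]:
    """Return the words covered by a node, left to right."""
    if not node.children:
        return [node.word]
    return [w for child in node.children for w in leaves(child)]


def extract_entities(node: Node, start: int = 0) -> List[Tuple[str, int, int, str]]:
    """Collect (type, start, end, text) for every NE-labeled constituent.

    Offsets count words; an outer entity is listed before the entities nested in it.
    """
    found = []
    words = leaves(node)
    ne_type = node.label.rsplit("-", 1)[-1]
    if "-" in node.label and ne_type in NE_TYPES:
        found.append((ne_type, start, start + len(words), "".join(words)))
    offset = start
    for child in node.children:
        found.extend(extract_entities(child, offset))
        offset += len(leaves(child))
    return found


# Illustrative tree: an organization name that contains a place name.
tree = parse_bracketed("(IP (NP-ORG (NP-GPE (NR 北京)) (NN 大学)) (VP (VV 招生)))")
entities = extract_entities(tree)
```

In this sketch the organization 北京大学 and the place name 北京 nested inside it both fall out of one tree traversal, which illustrates why a tree-shaped output is convenient for nested entities: each entity is simply a labeled constituent.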
