Noun Compound and Named Entity Recognition and their Usability in Keyphrase Extraction

We investigate how the automatic identification of noun compounds and named entities can contribute to keyphrase extraction. We also show how previously identified noun compounds affect named entity recognition and, conversely, how identified named entities support noun compound detection. Our experiments demonstrate that already known noun compounds yield better performance in named entity recognition and that already known named entities enhance noun compound detection. Integrating noun compound and named entity features into a keyphrase extractor also proves more effective than a model that does not include them. Our results indicate that these features tend to be beneficial in several NLP tasks.
