Classification of entities via their descriptive sentences

Hypernym identification for open-domain entities is crucial for taxonomy construction and for many higher-level applications. Current methods suffer from either low precision or low recall. To make the problem more tractable, we recast it as classification: we pre-define a concept taxonomy and classify each entity into one of its leaf concepts, based on the entity's name and description. A convolutional neural network classifier and a K-means clustering module are used for classification. We applied this system to 2.1 million Baidu Baike entities; 1.1 million of them were successfully identified, with a precision of 99.36%.
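The classification framing can be illustrated with a minimal sketch: once an entity is assigned to a leaf concept of a pre-defined taxonomy, its hypernyms are that leaf and the leaf's ancestors. Everything named below is hypothetical and for illustration only: the toy taxonomy `PARENT`, the keyword table `CUES`, and the `min_score` abstention threshold. The keyword scorer is a simple stand-in, not the convolutional neural network or the K-means module described above, and the abstract does not say how unidentified entities are handled.

```python
import re
from typing import Dict, List, Optional, Set

# Hypothetical toy taxonomy: concept -> parent concept (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "thing": None,
    "person": "thing",
    "organization": "thing",
    "singer": "person",
    "athlete": "person",
    "university": "organization",
}

# Hypothetical keyword cues per leaf concept (stand-in for a trained classifier).
CUES: Dict[str, Set[str]] = {
    "singer": {"singer", "album", "song"},
    "athlete": {"athlete", "olympic", "player"},
    "university": {"university", "campus", "college"},
}


def leaf_concepts(parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the concepts that are never used as a parent, sorted by name."""
    internal = {p for p in parent.values() if p is not None}
    return sorted(c for c in parent if c not in internal)


def hypernym_chain(leaf: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Walk from a leaf concept up to the root, collecting every ancestor."""
    chain: List[str] = []
    node: Optional[str] = leaf
    while node is not None:
        chain.append(node)
        node = parent[node]
    return chain


def classify(name: str, description: str, min_score: int = 1) -> Optional[str]:
    """Pick the leaf whose cues overlap most with the entity's name and
    description; return None (abstain) when the best overlap is too small."""
    tokens = set(re.findall(r"[a-z]+", f"{name} {description}".lower()))
    scores = {leaf: len(tokens & cues) for leaf, cues in CUES.items()}
    best = max(scores, key=lambda leaf: scores[leaf])
    return best if scores[best] >= min_score else None


if __name__ == "__main__":
    leaf = classify("Example University", "A public university with a large campus.")
    if leaf is not None:
        print(leaf, "->", hypernym_chain(leaf, PARENT))
```

Restricting predictions to leaf concepts means a single decision yields the whole hypernym chain, and abstaining on low-scoring entities is one way to trade coverage for precision.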