Learning Entity Representation for Entity Disambiguation

In this paper we present a novel disambiguation model based on neural networks. Most existing studies focus on designing effective hand-crafted features and complicated similarity measures to obtain better disambiguation performance. Instead, our method learns distributed representations of entities to measure similarity without hand-crafted features. An entity representation consists of a context document representation and a category representation. The document representation of an entity is learned with a deep neural network (DNN) and is directly optimized for a given similarity measure. A convolutional neural network (CNN) is employed to obtain the category representation, and it shares its deep layers with the DNN. Both models are trained jointly on a large collection of documents from Baike (http://baike.baidu.com/). Experimental results show that our method achieves good performance on two datasets without any manually designed features.
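To make the two-branch design concrete, the following is a minimal forward-pass sketch only, not the model as trained in the paper. Everything specific in it is an assumption made for illustration: the layer sizes, the random (untrained) weights, the bag-of-words document input, the single convolution layer with max-over-time pooling, the concatenation of the two parts, cosine as the similarity measure, and all function names. Joint training and optimization for the similarity measure are omitted.

```python
import math
import random

random.seed(0)

def rand_matrix(rows, cols):
    """Small random weight matrix (a stand-in for trained parameters)."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def dense_tanh(weights, vec):
    """One fully connected layer with tanh activation."""
    return [math.tanh(sum(w * x for w, x in zip(row, vec))) for row in weights]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Illustrative sizes only (assumptions, not values from the paper).
VOCAB, EMB, HIDDEN, OUT, WINDOW = 6, 4, 5, 3, 2

W_doc = rand_matrix(HIDDEN, VOCAB)          # DNN lower layer: bag-of-words -> hidden
W_emb = rand_matrix(VOCAB, EMB)             # word embeddings for category words
W_conv = rand_matrix(HIDDEN, WINDOW * EMB)  # CNN filters over word windows
W_shared = rand_matrix(OUT, HIDDEN)         # upper layer shared by both branches

def encode_document(bow):
    """Context-document branch: dense layer, then the shared layer."""
    return dense_tanh(W_shared, dense_tanh(W_doc, bow))

def encode_categories(word_ids):
    """Category branch: convolution, max-over-time pooling, then the shared layer."""
    embs = [W_emb[i] for i in word_ids]
    # Each window is the concatenation of WINDOW consecutive word embeddings.
    windows = [sum(embs[i:i + WINDOW], []) for i in range(len(embs) - WINDOW + 1)]
    feature_maps = [dense_tanh(W_conv, w) for w in windows]
    pooled = [max(col) for col in zip(*feature_maps)]
    return dense_tanh(W_shared, pooled)

def encode_entity(bow, category_word_ids):
    """Entity representation: document part concatenated with category part."""
    return encode_document(bow) + encode_categories(category_word_ids)

def rank_candidates(query, candidates):
    """Sort candidate names by cosine similarity to the query representation."""
    scored = [(cosine(query, rep), name) for name, rep in candidates.items()]
    return [name for _, name in sorted(scored, reverse=True)]
```

In this sketch, `rank_candidates` takes a query representation and a dictionary mapping candidate names to representations produced by `encode_entity`, and returns the names ordered from most to least similar. How the mention side is encoded is left open here, since the abstract does not specify it.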
