Entity Linking Meets Deep Learning: Techniques and Solutions

Entity linking (EL) is the process of linking entity mentions appearing in web text with their corresponding entities in a knowledge base. EL plays an important role in knowledge engineering and data mining, underlying a variety of downstream applications such as knowledge base population, content analysis, relation extraction, and question answering. In recent years, deep learning (DL), which has achieved tremendous success in various domains, has also been leveraged in EL methods, which now surpass traditional machine-learning-based methods and yield state-of-the-art performance. In this survey, we present a comprehensive review and analysis of existing DL-based EL methods. First, we propose a new taxonomy that organizes existing DL-based EL methods along three axes: embedding, feature, and algorithm. We then systematically survey representative EL methods along these three axes. Next, we introduce ten commonly used EL data sets and give a quantitative performance analysis of DL-based EL methods over these data sets. Finally, we discuss the remaining limitations of existing methods and highlight some promising future directions.
