Author Disambiguation through Adversarial Network Representation Learning

Many people share the same name. Distinguishing different people with the same name is important but challenging. Although much work has been proposed for author disambiguation, most of it does not adequately consider the heterogeneous relationships among authors and papers. In our work, ambiguous names and their related information, such as papers, conferences, titles, and abstracts, are used to construct a heterogeneous network with multiple edge types. To fully incorporate the information in the constructed network, we use Generative Adversarial Networks (GANs) to learn its representation. Although GANs have been used in many fields, such as image generation, they have not been used to obtain representations of heterogeneous networks. As far as we know, our work is the first to use adversarial training to learn heterogeneous network representations. After the representations are learned, they are partitioned into groups, each representing a distinct author. Through extensive experiments on three major author disambiguation datasets, we demonstrate that our method outperforms several state-of-the-art baselines on the author disambiguation problem.
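
To make the network construction step concrete, the following is a minimal sketch of how bibliographic records might be turned into a heterogeneous network with typed edges. The record fields, the three edge types (paper-author, paper-venue, paper-word), and the function name `build_hetero_network` are assumptions made for illustration; the abstract does not specify the exact schema, and the adversarial representation learning and clustering steps are not shown.

```python
from collections import defaultdict

# Hypothetical toy records; the field names are assumptions for illustration.
papers = [
    {"id": "p1", "authors": ["J. Smith", "A. Lee"], "venue": "KDD",
     "title": "graph embedding methods"},
    {"id": "p2", "authors": ["J. Smith", "B. Chen"], "venue": "CIKM",
     "title": "name disambiguation graph"},
    {"id": "p3", "authors": ["A. Lee"], "venue": "KDD",
     "title": "adversarial embedding"},
]


def build_hetero_network(records):
    """Return a dict mapping each edge type to a set of (source, target) pairs.

    Nodes are (node_type, value) tuples, so a paper, an author name, a venue,
    and a title word never collide even when their strings are equal.
    """
    edges = defaultdict(set)
    for rec in records:
        paper = ("paper", rec["id"])
        for name in rec["authors"]:
            edges["paper-author"].add((paper, ("author", name)))
        edges["paper-venue"].add((paper, ("venue", rec["venue"])))
        # Assumed tokenization: lowercase, whitespace-split title words.
        for word in set(rec["title"].lower().split()):
            edges["paper-word"].add((paper, ("word", word)))
    return dict(edges)


network = build_hetero_network(papers)
edge_counts = {edge_type: len(pairs) for edge_type, pairs in network.items()}
```

In this sketch both papers signed "J. Smith" attach to the same ambiguous name node; a disambiguation method would then have to decide, from the learned representations, whether they belong to one author or to several.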
