Disambiguating Authors in Academic Search Engines

Author name ambiguity is a common problem in current academic search engines. It poses two challenges: first, researchers cannot easily retrieve the publications of a specific author; second, an author's publication profile cannot be built reliably, which prevents further semantic analysis of that author. In this paper, we present a coauthorship-based model for disambiguating authors and apply it to a large academic search engine, Scholat Search, where it proves to be effective and efficient.
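
The abstract does not spell out the model, so the following is only a minimal sketch of the general idea behind coauthorship-based disambiguation, not the method used in Scholat Search. It assumes that two papers carrying the same ambiguous name belong to the same person if they share at least one coauthor, directly or through a chain of such papers. The function name `cluster_by_coauthors` and the input layout (one author list per paper) are hypothetical choices made for illustration.

```python
from collections import defaultdict


def cluster_by_coauthors(papers, target_name):
    """Group papers that carry `target_name` by shared coauthors.

    `papers` is a list of author-name lists, one per paper, all assumed
    to contain `target_name`. Returns a sorted list of clusters, each a
    list of paper indices assumed to belong to one real person.
    """
    parent = list(range(len(papers)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        # Merge two clusters, keeping the smaller index as the root.
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    # Remember the first paper in which each coauthor appeared; any later
    # paper with the same coauthor is merged into that paper's cluster.
    first_seen = {}
    for idx, authors in enumerate(papers):
        for coauthor in set(authors) - {target_name}:
            if coauthor in first_seen:
                union(idx, first_seen[coauthor])
            else:
                first_seen[coauthor] = idx

    clusters = defaultdict(list)
    for idx in range(len(papers)):
        clusters[find(idx)].append(idx)
    return sorted(clusters.values())


if __name__ == "__main__":
    # Made-up author lists, for illustration only.
    example = [
        ["Wei Wang", "A. Smith"],
        ["Wei Wang", "A. Smith", "B. Jones"],
        ["Wei Wang", "C. Brown"],
        ["Wei Wang", "B. Jones"],
    ]
    print(cluster_by_coauthors(example, "Wei Wang"))
```

A rule this simple leaves single-author papers in clusters of their own and merges distinct people who happen to share a coauthor name, so a practical system would likely combine it with further evidence such as affiliation or venue.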
