Characteristics and Evolution of Citation Distance Based on LDA Method

The research behavior of scholars is a core issue in the study of science, and the ideas and methods of complex networks offer a new perspective on it. Scientific citation networks and scientist collaboration networks are widely used to study the citation behavior of scholars and the dissemination of scientific ideas, and they have produced some results. However, research based solely on network topology is limited because it lacks information about the content of the articles. This paper combines network analysis with textual content analysis through LDA to study how content correlation is distributed between articles linked by citation, and how that distribution evolves over time. We find that the distribution of citation distance is approximately normal, but that citation distances are visibly short. Authors show a citation preference for documents at a certain distance.
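To make "citation distance" concrete, the following is a minimal sketch, not the paper's implementation. It assumes each article has already been reduced to a topic distribution by an LDA model, and it assumes the Jensen-Shannon distance as the content distance between a citing and a cited article; the abstract does not state which measure the authors use. The names `jensen_shannon_distance`, `citation_distances`, and the toy topic mixtures are hypothetical.

```python
import math


def jensen_shannon_distance(p, q):
    """Jensen-Shannon distance (base 2) between two topic distributions.

    Assumption: p and q are per-document topic mixtures from an LDA model,
    i.e. non-negative lists of equal length that each sum to 1.
    The result lies in [0, 1]; 0 means identical topic mixtures.
    """
    if len(p) != len(q):
        raise ValueError("topic distributions must have the same length")
    # Midpoint distribution between p and q.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence in bits; zero-probability terms contribute 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(divergence, 0.0))


def citation_distances(topic_mix, citations):
    """Return one content distance per (citing, cited) pair.

    topic_mix: dict mapping article id -> topic distribution (hypothetical input).
    citations: iterable of (citing_id, cited_id) edges of the citation network.
    """
    return [
        jensen_shannon_distance(topic_mix[citing], topic_mix[cited])
        for citing, cited in citations
    ]


# Toy data with three topics; the values are illustrative, not from the paper.
topic_mix = {
    "A": [0.7, 0.2, 0.1],
    "B": [0.6, 0.3, 0.1],
    "C": [0.1, 0.1, 0.8],
}
distances = citation_distances(topic_mix, [("A", "B"), ("A", "C")])
```

Collecting these distances over all citation edges, grouped by publication year, would give the kind of distribution and time evolution the abstract describes. Other measures, such as cosine or Hellinger distance, could be substituted under the same interface.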
