Detecting trends in academic research from a citation network using network representation learning

Several network features and information retrieval methods have been proposed to elucidate the structure of citation networks and to detect important nodes. However, it remains difficult to retrieve information about trends in an academic field, or to detect its cutting-edge areas, from the citation network alone. In this paper, we propose a novel framework that detects a trend as the growth direction of a citation network, using network representation learning (NRL). We presume that the linear growth of a citation network in the latent space obtained by NRL results from the iterative edge-addition process of the network. On APS datasets and on papers from several domains of the Web of Science, we confirm the existence of trends by observing that an academic field grows linearly in a specific direction in latent space. Next, we calculate each node's degree of trend-following as an indicator called the intrinsic publication year (IPY). We find a correlation between this indicator and the number of future citations. Furthermore, a word used frequently in the abstracts of cutting-edge (high-IPY) papers is likely to be used often in future publications. These results support the validity of the detected trend for predicting citation network growth.
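The idea of a growth direction in latent space can be illustrated with a minimal sketch. The abstract does not give the exact definition of IPY, so the code below makes a loud assumption: the growth direction is taken as the least-squares direction along which publication year increases, and a paper's IPY is the year that fit assigns to its embedding. The function name `growth_direction_and_ipy` and the synthetic embeddings are hypothetical and stand in for the output of any NRL method; this is not the authors' implementation.

```python
import numpy as np


def growth_direction_and_ipy(embeddings, years):
    """Fit year ~ w.x + b by least squares (an assumed stand-in for the paper's method).

    Returns the unit-length growth direction and the fitted year of each node,
    which this sketch treats as an IPY-like score.
    """
    X = np.asarray(embeddings, dtype=float)
    y = np.asarray(years, dtype=float)
    # Append a column of ones so the fit includes an intercept b.
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    w, b = coef[:-1], coef[-1]
    direction = w / np.linalg.norm(w)
    ipy = X @ w + b
    return direction, ipy


# Synthetic stand-in for NRL embeddings: points drift along axis 0 over time.
rng = np.random.default_rng(0)
years = rng.integers(2000, 2020, size=200)
drift = np.outer(years - 2000, np.array([1.0, 0.0, 0.0, 0.0])) * 0.3
X = rng.normal(scale=0.5, size=(200, 4)) + drift

direction, ipy = growth_direction_and_ipy(X, years)

# Under this assumption, papers whose fitted year exceeds their actual year
# sit "ahead" of their cohort along the growth direction.
ahead = ipy - years
print("growth direction:", np.round(direction, 2))
print("most trend-leading paper index:", int(np.argmax(ahead)))
```

Under this reading, a paper whose fitted year exceeds its actual publication year lies further along the growth direction than its contemporaries, which is the sense in which a high-IPY paper would be called cutting-edge.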
