An overview of the history of Science of Science in China based on the use of bibliographic and citation data: a new method of analysis based on clustering with feature maximization and contrast graphs

The first part of this paper discusses the historical context of Science of Science, both in China and worldwide. The second part applies an unsupervised combination of Growing Neural Gas (GNG) clustering, feature maximization metrics, and the associated contrast graphs to analyze the contents of selected academic journal papers on Science of Science in China and to build an overall map of the structure of research topics over the last 40 years. We also use publication dates to show how the topics have evolved, and author information to clarify the topics' content. The results, reviewed and approved by three leading experts in the field, show that Chinese Science of Science has gradually matured over the last 40 years, evolving from the general nature of the discipline itself to related disciplines and their potential interactions, from qualitative analysis to quantitative and visual analysis, and from general research on the social function of science to more specific studies of its economic and strategic functions. The proposed method can thus be used without supervision, without parameters, and without any external knowledge to obtain clear and precise insights into the development of a scientific domain. Finally, domain experts compare the output of the topic extraction part of the method (clustering + feature maximization) with the output of the well-known Latent Dirichlet Allocation (LDA) approach; the comparison highlights the clear superiority of the proposed approach.
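The abstract names feature maximization and contrast without defining them. As a rough illustration, the following Python sketch computes a per-cluster feature F-measure (the harmonic mean of a feature's recall and precision within a cluster) and a contrast score, following the commonly published formulation of feature maximization. It is a sketch under stated assumptions, not the authors' implementation: the function names `feature_f_measure` and `select_and_contrast` are hypothetical, the cluster labels are assumed to come from any prior clustering step (GNG is not implemented here), and the averages are taken over all clusters as a simplification.

```python
import numpy as np


def feature_f_measure(weights, labels, n_clusters):
    """Feature F-measure per (cluster, feature).

    weights: (n_docs, n_features) array of non-negative feature weights.
    labels:  (n_docs,) array of cluster indices from any clustering step.
    """
    weights = np.asarray(weights, dtype=float)
    labels = np.asarray(labels)
    # Total weight of each feature inside each cluster.
    sums = np.zeros((n_clusters, weights.shape[1]))
    for c in range(n_clusters):
        sums[c] = weights[labels == c].sum(axis=0)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Recall: share of a feature's total weight that falls in the cluster.
        recall = np.nan_to_num(sums / sums.sum(axis=0, keepdims=True))
        # Precision: share of the cluster's total weight carried by the feature.
        precision = np.nan_to_num(sums / sums.sum(axis=1, keepdims=True))
        # Harmonic mean of the two; 0/0 cases are mapped to 0.
        ff = np.nan_to_num(2 * recall * precision / (recall + precision))
    return ff


def select_and_contrast(ff):
    """Select cluster-specific features and compute their contrast.

    Assumption: a feature is kept for a cluster when its F-measure exceeds
    both its own mean over clusters and the overall mean F-measure.
    """
    mean_per_feature = ff.mean(axis=0, keepdims=True)
    overall_mean = ff.mean()
    selected = (ff > mean_per_feature) & (ff > overall_mean)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Contrast: how strongly a feature stands out in one cluster
        # relative to its average behavior across clusters.
        contrast = np.nan_to_num(ff / mean_per_feature)
    return selected, contrast


if __name__ == "__main__":
    # Toy data: 4 documents, 2 features, 2 clusters (illustrative only).
    toy_weights = [[2, 0], [2, 0], [0, 3], [0, 1]]
    toy_labels = [0, 0, 1, 1]
    ff = feature_f_measure(toy_weights, toy_labels, n_clusters=2)
    selected, contrast = select_and_contrast(ff)
    print(ff)
    print(selected)
    print(contrast)
```

Features with a contrast above 1 in a cluster are the ones that would typically be used to label that cluster or to draw its edges in a contrast graph; the exact selection rule used in the paper may differ from this simplified version.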
