Understanding the topic evolution in a scientific domain: An exploratory study for the field of information retrieval

Understanding topic evolution in a scientific domain is essential for capturing key domain developments and facilitating knowledge transfer within and across domains. Using a data set of information retrieval (IR) publications, this paper examines how research topics evolve by analyzing topic trends, evolving dynamics, and semantic word shifts in the IR domain. Knowledge transfer between topics and the developmental status of the major topics are identified; both are represented by the merging and splitting of local topics in different time periods. Results show that the evolution of a major topic usually follows a pattern from an adjusting status to a mature status, sometimes with a re-adjusting status in between. Knowledge transfer happens both within a topic and among topics. Word migration via topic channels is defined, and three migration types (non-migration, dual-migration, and multi-migration) are distinguished to facilitate a better understanding of topic evolution.
