Extracting scientific trends by mining topics from Call for Papers

© 2019, Emerald Publishing Limited.

Purpose: This paper presents a novel approach for mining scientific trends using topics from Calls for Papers (CFP). The resulting trends give researchers, academics, funding institutes and research administration departments an input for setting research directions.

Design/methodology/approach: The authors compile a new CFP data set to analyse scientific evolution and the prestige of conferences that set scientific trends, using scientific publications indexed in DBLP. Using the Field of Research code 804 from the Australian Research Council, the authors classify 146 conferences (from 2006 to 2015) into thematic areas by matching the terms extracted from publication titles with the Association for Computing Machinery Computing Classification System. The authors further enrich the vocabulary of terms using the WordNet dictionary and the Growbag data set. To measure the significance of terms, the authors adopt the following weighting schemas: probabilistic, gram, relative, accumulative and hierarchical.

Findings: The results indicate the rise of “big data analytics” in CFP topics over the last few years. Topics related to “privacy and security” show an exponential increase, whereas topics related to “semantic web” show a decline in recent years. By analysing publication output in DBLP that matches the CFP of conferences ranked A* to C in ERA Core, the authors find that A* and A tier conferences are not the only ones that set publication trends, since B or C tier conferences target similar CFP.

Originality/value: Overall, the analyses presented in this research are valuable for the scientific community and research administrators in studying research trends and in improving the data management of digital libraries of scientific literature.
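The term-matching step described in the methodology can be illustrated with a minimal sketch. It assumes a tiny hand-made vocabulary in place of the ACM Computing Classification System, and it interprets the "relative" weighting as each category's share of all matched terms in a year. The abstract does not define the weighting schemas, so that interpretation, the names `CATEGORY_TERMS`, `tokenize` and `relative_weights`, and the sample titles are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical mini-vocabulary standing in for ACM CCS categories.
# The real classification system is far larger; this is illustrative only.
CATEGORY_TERMS = {
    "security and privacy": {"privacy", "security", "encryption"},
    "information systems": {"semantic", "web", "ontology", "data"},
}


def tokenize(title):
    """Lowercase a title and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", title.lower())


def relative_weights(titles):
    """Return each category's share of all vocabulary matches in the titles.

    Assumption: 'relative' weighting means matches for one category divided
    by matches across all categories. The source does not give the formula.
    """
    counts = Counter()
    for title in titles:
        tokens = set(tokenize(title))
        for category, terms in CATEGORY_TERMS.items():
            counts[category] += len(tokens & terms)
    total = sum(counts.values())
    return {c: (n / total if total else 0.0) for c, n in counts.items()}


# Made-up titles grouped by year, used only to exercise the functions.
titles_by_year = {
    2006: ["Ontology alignment for the semantic web"],
    2015: ["Privacy and security in big data analytics"],
}

trend = {year: relative_weights(titles) for year, titles in titles_by_year.items()}
for year in sorted(trend):
    print(year, trend[year])
```

Comparing the per-year weights of a category gives a rough trend line for that thematic area. Vocabulary enrichment (for example, with WordNet synonyms) would enlarge the term sets before matching.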
