A decade of Semantic Web research through the lenses of a mixed methods approach

The identification of research topics and trends is an important scientometric activity, as it can help guide the direction of future research. In the Semantic Web area, topic and trend detection was initially performed primarily through qualitative, top-down approaches that rely on expert knowledge. More recently, data-driven, bottom-up approaches have been proposed that offer a quantitative analysis of the evolution of a research domain. In this paper, we aim to provide a broader and more complete picture of Semantic Web topics and trends by adopting a mixed methods methodology, which allows for the combined use of both qualitative and quantitative approaches. Concretely, we build on a qualitative analysis of the main seminal papers, which adopt a top-down approach, and on quantitative results derived with three bottom-up, data-driven approaches (Rexplore, Saffron, PoolParty) on a corpus of Semantic Web papers published between 2006 and 2015. In this process, we use the latter both to “fact-check” the former and to derive key findings about the strengths and weaknesses of top-down and bottom-up approaches to research topic identification. Although we provide a detailed study of the past decade of Semantic Web research, the findings and the methodology are relevant not only to our community but also to research fields beyond the Semantic Web.
