Big Data Research Landscape: A Meta-Analysis and Literature Review from 2009 to 2018

Big data is a relatively new and active field of research and practice. Researchers entering a new field need to know the themes, trends, and gaps that characterize it in order to make informed choices. Identifying up-to-date trends and potential gaps in a research field requires a detailed analysis of the domain-specific literature. Because the scientific literature grows exponentially, identifying such trends manually is difficult. In this context, probabilistic topic modeling is an effective approach that has recently attracted considerable attention for the semantic analysis of large-scale text collections. In this study, a semantic content analysis based on topic modeling was conducted on the big data literature from 2009 to 2018 to discover research trends and themes in big data.
