A knowledge discovery in community contributions of big data technologies

The growing variety of big data technologies in open source communities makes it difficult for organizations to generate value from these advancements. The technology landscape lacks an overall perspective that would clarify the fragmented understanding of technologies, their unpredictable lifecycles, and their uncertain adoption, so that organizations can select technologies that are useful to their business. More than one million contributions of features, bugs, and changes have been pushed to publicly available code repositories to develop big data technologies, yet the knowledge contained in this data basis remains hidden. Using this source could help to identify insights about technological domains and about how contributors adopt newly emerging big data technologies. A knowledge discovery process was applied to analyze the contribution behavior of over 21,000 contributors across 269 big data technologies. The results reveal an ecosystem that structures big data technologies by dynamic contributor networks, with implications for organizations' adoption of these technologies.
