Identifying the Challenges of the Blockchain Community from StackExchange Topics and Trends

Software developers around the globe have shown tremendous interest in blockchain, with more than seven thousand active blockchain software (BCS) projects on GitHub. Yet little research has focused on understanding the challenges encountered by the developers of those projects, as well as by their users. The objective of this study is therefore to better understand the primary areas of challenge encountered by the BCS community. Using Latent Dirichlet Allocation (LDA) based topic modeling, we identify discussion topics from the two blockchain-related StackExchange sites. We manually investigated the posts belonging to each topic to understand the challenges encountered by developers. The results of our study reveal that while the ratio of posts on BCS development is increasing, the ratio of posts on mining cryptocurrencies is decreasing. Due to the scarcity of expert blockchain developers, posts on BCS development are more likely either to go unanswered or to encounter longer delays than posts on other topics. Based on our findings, we recommend that project maintainers invest effort in improving documentation on BCS development, since this is the area where the community most lacks supporting material.
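To make the notion of a rising or falling "ratio of posts" concrete, the following is a minimal sketch, not the procedure used in the study. It assumes each post has already been assigned a single dominant topic (for example, by an LDA model) and a time period. The function names `topic_ratios` and `trend_slope`, the topic labels, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def topic_ratios(posts):
    """Return {period: {topic: share of that period's posts}}.

    `posts` is an iterable of (period, topic) pairs; the single-topic-per-post
    assignment is an assumption of this sketch.
    """
    counts = defaultdict(Counter)
    for period, topic in posts:
        counts[period][topic] += 1
    ratios = {}
    for period, counter in counts.items():
        total = sum(counter.values())
        ratios[period] = {t: n / total for t, n in counter.items()}
    return ratios


def trend_slope(ratios, topic):
    """Least-squares slope of a topic's share across periods in sorted order."""
    periods = sorted(ratios)
    ys = [ratios[p].get(topic, 0.0) for p in periods]
    xs = list(range(len(periods)))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    denom = sum((x - mean_x) ** 2 for x in xs)
    if denom == 0:
        return 0.0  # a single period carries no trend information
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / denom


# Made-up toy data for illustration only; these are not results from the study.
posts = [
    ("2016", "mining"), ("2016", "mining"), ("2016", "mining"), ("2016", "development"),
    ("2017", "mining"), ("2017", "mining"), ("2017", "development"), ("2017", "development"),
    ("2018", "mining"), ("2018", "development"), ("2018", "development"), ("2018", "development"),
]

ratios = topic_ratios(posts)
dev_slope = trend_slope(ratios, "development")
mining_slope = trend_slope(ratios, "mining")
```

A positive slope indicates a topic whose share of posts grows over time and a negative slope one whose share shrinks. A slope alone does not establish statistical significance, so a real analysis would pair it with an appropriate time-series test.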
