How do developers discuss and support new programming languages in technical Q&A site? An empirical study of Go, Swift, and Rust in Stack Overflow

Context: New programming languages (e.g., Swift, Go, and Rust) are being introduced to make software development more robust and easier for developers. In its early stage, a programming language is likely to have limited resources, which encourages developers to frequently seek help from experienced peers on question-answering (QA) sites such as Stack Overflow (SO). Objective: In this study, we formally study the discussions on three popular new languages introduced after the inception of SO (2008) and match them with the relevant activities on GitHub whenever appropriate. For that purpose, we mined 41,782,536 questions and answers from SO, along with information on 7,846 issues and 660,965 repositories from GitHub. Initially, the development of new languages is relatively slow compared to mature languages (e.g., C, C++, Java). The expected outcome of this study is to reveal the difficulties and challenges faced by developers working with these languages so that appropriate measures can be taken to expedite the generation of relevant resources. Method: We applied Latent Dirichlet Allocation (LDA) to SO questions and answers to identify the topics discussed for the new languages. We extracted several features of the answer patterns for these languages from SO (e.g., time to get an answer and time to get an accepted answer) to study their characteristics, and used these features to identify difficult topics. We explored the background of the developers who contribute to these languages. We built a model by combining SO data with GitHub issue, repository, and user data. Finally, we used that model to identify factors that affect language evolution.
Results: The major findings of the study are: (i) migration, data, and data structures are generally the difficult topics of new languages, (ii) the time when adequate resources are expected to be available varies from language to language, (iii) the unanswered question ratio increases regardless of the age of the language, and (iv) there is a relationship between developers' activity patterns and the growth of a language. Conclusion: We believe that the outcome of our study is likely to help the owners/sponsors of these languages design better features and documentation. It will also help software developers and students prepare themselves to work with these languages in an informed way.
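As an illustration of the answer-pattern features mentioned in the Method (time to get an answer, time to get an accepted answer) and the unanswered question ratio mentioned in the Results, the following is a minimal sketch in Python. It is not the authors' pipeline: the record layout, the field names (`language`, `created`, `first_answer`, `accepted_answer`), the sample timestamps, and the use of the median as the summary statistic are all assumptions made for this example.

```python
from datetime import datetime
from statistics import median

# Hypothetical question records; the field names and values are illustrative
# assumptions, not the schema or data used in the study.
QUESTIONS = [
    {"language": "rust", "created": "2020-01-01T10:00:00",
     "first_answer": "2020-01-01T10:30:00", "accepted_answer": "2020-01-01T12:00:00"},
    {"language": "rust", "created": "2020-01-02T09:00:00",
     "first_answer": None, "accepted_answer": None},
    {"language": "go", "created": "2020-01-03T08:00:00",
     "first_answer": "2020-01-03T09:00:00", "accepted_answer": "2020-01-03T09:00:00"},
    {"language": "go", "created": "2020-01-04T08:00:00",
     "first_answer": "2020-01-04T11:00:00", "accepted_answer": None},
]


def hours_between(start, end):
    """Return the elapsed time in hours between two ISO-8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 3600


def answer_features(records):
    """Compute per-language answer-pattern features from question records."""
    by_language = {}
    for question in records:
        by_language.setdefault(question["language"], []).append(question)

    features = {}
    for language, questions in by_language.items():
        # Only questions that actually received an answer contribute a delay.
        first = [hours_between(q["created"], q["first_answer"])
                 for q in questions if q["first_answer"]]
        accepted = [hours_between(q["created"], q["accepted_answer"])
                    for q in questions if q["accepted_answer"]]
        unanswered = sum(1 for q in questions if q["first_answer"] is None)
        features[language] = {
            "median_hours_to_first_answer": median(first) if first else None,
            "median_hours_to_accepted_answer": median(accepted) if accepted else None,
            "unanswered_ratio": unanswered / len(questions),
        }
    return features


FEATURES = answer_features(QUESTIONS)
```

Grouping the same features by topic instead of by language (for example, by the dominant LDA topic of each question) would give one possible way to rank topics by difficulty, under the assumption that longer delays and a higher unanswered ratio indicate harder topics.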
