A systematic mapping study on mining software repositories

Background: Software repositories provide large amounts of data covering the changes a software system undergoes throughout its evolution. These repositories can be mined to extract and analyze pertinent information and to draw conclusions about the software's history or its current snapshot. Objective: This work investigates recent studies on Mining Software Repositories (MSR) approaches, collecting evidence about software analysis goals (purpose, focus, and object of analysis), data sources, evaluation methods, tools, and how the area is evolving. Method: A systematic mapping study was performed to identify and analyze research on mining software repositories, covering five editions of the Working Conference on Mining Software Repositories, the main conference in this area. Results: MSR approaches have been used for many different goals, mainly comprehension of defects, analysis of the contribution and behavior of developers, and comprehension of software evolution. In addition, some gaps were identified with respect to goals, focus, and data source type (e.g., lack of use of comments to identify smells, refactoring, and software quality issues). Regarding the evaluation method, our analysis pointed to extensive use of some types of empirical evaluation. Conclusion: MSR studies have focused on different goals; however, there are still many research opportunities to be explored and issues associated with MSR that should be considered.
