Beyond source code: The importance of other artifacts in software development (a case study)

Current software systems increasingly contain elements that software engineering research has not usually considered. Source artifacts, understood as the source components needed to obtain a binary, ready-to-use version of a program, comprise in many systems more than just the elements written in a programming language (source code). Especially when we move away from systems programming and enter the realm of end-user applications, we find files for documentation, interface specifications, internationalization and localization modules, and multimedia data. All of them are source artifacts in the sense that developers work directly with them and that applications are built automatically using them as input. This paper discusses the differences and relationships between source code (usually written in a programming language) and these other files by analyzing the KDE software versioning repository (with about 6,800,000 commits and 450,000 files). We perform a comprehensive study of those files and their evolution over time, looking for patterns and trying to infer from them the related behaviors of developers with different profiles. From this we conclude that studying those 'other' source artifacts can provide a great deal of insight into a software system.
