Macro-level software evolution: a case study of a large software compilation

Software evolution studies have traditionally focused on individual products. In this study we scale up the idea of software evolution by considering software compilations composed of a large number of independently developed products, engineered to work together. With the success of libre (free, open source) software, these compilations have become common in the form of ‘software distributions’, which group hundreds or thousands of software applications and libraries into an integrated system. We have performed an exploratory case study on one of them, Debian GNU/Linux, and report four main findings. First, Debian has been doubling in size every 2 years, totalling about 300 million lines of code as of 2007. Second, the mean size of packages has remained stable over time. Third, the number of dependencies between packages has been growing quickly. Finally, while C is still by far the most commonly used programming language for applications, use of the C++, Java, and Python languages has increased significantly. The study not only helps to understand the evolution of Debian but also yields insights into the evolution of mature libre software systems in general.
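To make the first finding concrete, the following minimal sketch shows what a constant 2-year doubling period implies arithmetically. It assumes idealized exponential growth anchored at the abstract's figure of about 300 million lines in 2007; the function names and the choice of anchor are illustrative assumptions, and the values it produces are implications of that simple model, not measurements from the study.

```python
import math


def annual_growth_rate(doubling_years: float) -> float:
    """Yearly growth rate implied by a constant doubling period."""
    return 2 ** (1 / doubling_years) - 1


def modelled_size(year: float,
                  ref_year: float = 2007,
                  ref_size: float = 300e6,
                  doubling_years: float = 2.0) -> float:
    """Size under idealized exponential growth.

    ref_year and ref_size come from the abstract (about 300 million
    lines in 2007); everything else is an assumption of this sketch.
    """
    return ref_size * 2 ** ((year - ref_year) / doubling_years)


if __name__ == "__main__":
    rate = annual_growth_rate(2.0)
    print(f"implied annual growth: {rate:.1%}")
    for y in (2003, 2005, 2007):
        print(y, f"{modelled_size(y) / 1e6:.0f} million lines (model)")
```

Under this model a 2-year doubling period corresponds to growth of roughly 41% per year, since 2 raised to the power 1/2 is about 1.414.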
