Growing up with stability: How open-source relational databases evolve

Like all software systems, databases evolve over time. The impact of this evolution can be vast, as a change to the schema of a database can affect the syntactic correctness and the semantic validity of all the surrounding applications. In this paper, we have performed a thorough, large-scale study on the evolution of databases that are part of larger open source projects, publicly available through open source repositories. Lehman's laws of software evolution, a well-established set of observations on how typical software systems evolve (matured during the last forty years), have served as our guide towards providing insights on the mechanisms that govern schema evolution. We found that, much like software systems, schemata expand over time, under a stabilization mechanism that constrains uncontrolled expansion with perfective maintenance. At the same time, unlike typical software systems, the growth is typically low, with long periods of calmness interrupted by bursts of maintenance and a surprising lack of complexity increase.
