Evaluating design decay during software evolution

Software systems evolve, requiring continuous maintenance and development. They undergo changes throughout their lifetimes as new features are added and bugs are fixed. As these systems evolve, their designs tend to decay and become less adaptable to users' changing requirements. Consequently, software designs become more complex and harder to maintain over time; in some not-so-rare cases, developers prefer redesigning from scratch rather than prolonging the life of existing designs, which raises development and maintenance costs. Therefore, developers must understand the factors that drive the decay of their designs and take proactive steps that facilitate future changes and slow down decay. Design decay occurs when changes are made to a software system by developers who do not understand its original design. On the one hand, making changes without understanding their effects may lead to the introduction of bugs and the premature retirement of the system. On the other hand, when developers lack the knowledge and/or experience to solve a design problem, they may introduce design defects, which are conjectured to hinder the evolution of systems and thus lead to design decay. Thus, developers need mechanisms to understand how a change to a system will impact the rest of the system, as well as tools to detect design defects.

In this dissertation, we propose three principal contributions. The first contribution aims to evaluate design decay. Measuring design decay consists of using a diagram matching technique to identify structural changes among versions of a design, such as a class diagram. Finding structural changes in long-lived, evolving designs requires the identification of class renamings. Thus, the first step of our approach concerns the identification of class renamings in evolving designs.
Then, the second step requires matching several versions of an evolving design to identify its decaying and stable parts. We propose bit-vector and incremental clustering algorithms for this matching. The third step consists of measuring design decay, for which we propose a set of metrics.

The second contribution is related to change impact analysis. We present a new metaphor inspired by seismology to identify the change impact. In particular, our approach considers a change to a class as an earthquake that propagates through a long chain of intermediary classes. Our approach combines static dependencies between classes and historical co-change relations to measure the scope of change propagation in a system, i.e., how far a change will propagate from a “changed class” to other classes.

The third contribution concerns design defect detection. We propose a metaphor inspired by the natural immune system. Like any living creature, designs are subject to diseases, which are design defects. Detection approaches are the defense mechanisms of designs. A natural immune system can detect similar pathogens with good precision. This good precision has inspired a family of classification algorithms, Artificial Immune Systems (AIS) algorithms, which we use to detect design defects.

The three contributions are evaluated on open-source object-oriented systems, and the obtained results enable us to draw the following conclusions:

• Design decay metrics, Tunnel Triplets Metric (TTM) and Common Triplets Metric (CTM), provide developers with useful insights regarding design decay. If TTM decreases, then the original design decays. If TTM is stable, then the original design is stable, which means that the system is more adapted to the new changing requirements.
• Seismology provides an interesting metaphor for change impact analysis. Changes propagate in systems like earthquakes. The change impact is most severe near the changed class and drops off away from it. Using external information, we show that our approach helps developers easily locate the change impact.
• The immune system provides an interesting metaphor for detecting design defects. The results of the experiments show that the precision and recall of our approach are comparable or superior to those of previous approaches.
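To make the change impact idea more concrete, the following is a minimal sketch and not the dissertation's actual algorithm. It assumes that "scope" can be approximated by breadth-first distance from the changed class in a static dependency graph, and that co-change evidence is a simple count of past commits touching both classes. The function names (`propagation_levels`, `co_change_counts`, `impact_by_level`) and the input shapes are hypothetical and chosen only for illustration.

```python
from collections import deque


def propagation_levels(dependents, changed):
    """Breadth-first distance from the changed class.

    `dependents` maps a class name to the classes that statically depend
    on it (assumed input shape, not taken from the source).
    """
    levels = {changed: 0}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for neighbor in dependents.get(current, ()):
            if neighbor not in levels:
                levels[neighbor] = levels[current] + 1
                queue.append(neighbor)
    return levels


def co_change_counts(commits, changed):
    """Count how often each class was committed together with `changed`.

    `commits` is an iterable of sets of class names (assumed input shape).
    """
    counts = {}
    for commit in commits:
        if changed in commit:
            for cls in commit:
                if cls != changed:
                    counts[cls] = counts.get(cls, 0) + 1
    return counts


def impact_by_level(dependents, commits, changed):
    """Group reachable classes by distance and attach co-change counts."""
    levels = propagation_levels(dependents, changed)
    counts = co_change_counts(commits, changed)
    result = {}
    for cls, level in levels.items():
        if cls == changed:
            continue
        result.setdefault(level, {})[cls] = counts.get(cls, 0)
    return result
```

With a toy graph in which `B` and `C` depend on `A`, and `D` depends on `B`, calling `impact_by_level` for `A` groups `B` and `C` at level 1 and `D` at level 2. Each class is annotated with how often it historically changed together with `A`, which gives a rough picture of how far a change might propagate.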
