Discovering and representing systematic code changes

Software engineers often inspect program differences when reviewing others' code changes, when writing check-in comments, or when determining why a program behaves differently than expected after a modification. Program differencing tools that support these tasks are limited in their ability to group related code changes or to detect potential inconsistencies in those changes. To overcome these limitations and to complement existing approaches, we built Logical Structural Diff (LSdiff), a tool that infers systematic structural differences as logic rules. LSdiff reports deviations from systematic changes as exceptions to the logic rules. We conducted a focus group study with professional software engineers at a large e-commerce company; we also compared LSdiff's results with textual differences and with structural differences without rules. Our evaluation suggests that LSdiff complements existing differencing tools by grouping code changes that form systematic change patterns regardless of their distribution throughout the code, and that its ability to discover anomalies shows promise in detecting inconsistent changes.
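The idea of a logic rule with exceptions can be illustrated with a minimal sketch. This is not LSdiff's actual fact representation or inference algorithm; the predicate names, class names, and the `check_rule` helper are all hypothetical and chosen only for illustration. The sketch takes one already-stated rule ("every element satisfying the antecedent should also satisfy the consequent") and separates the elements that follow it from those that deviate.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class Rule(NamedTuple):
    """A hypothetical rule: every element matching `antecedent` should match `consequent`."""
    antecedent: str
    consequent: str


def check_rule(rule: Rule, facts: Dict[str, Set[str]]) -> Tuple[List[str], List[str]]:
    """Split the elements selected by the antecedent into matches and exceptions."""
    selected = facts.get(rule.antecedent, set())
    expected = facts.get(rule.consequent, set())
    matches = sorted(selected & expected)      # elements that follow the systematic change
    exceptions = sorted(selected - expected)   # elements that deviate from it
    return matches, exceptions


# Hypothetical structural facts about a change (assumed data, not from the paper).
facts = {
    "subtype_of_Handler": {"LoginHandler", "CartHandler", "PayHandler"},
    "added_call_to_audit": {"LoginHandler", "CartHandler"},
}

rule = Rule("subtype_of_Handler", "added_call_to_audit")
matches, exceptions = check_rule(rule, facts)
print("matches:", matches)
print("exceptions:", exceptions)
```

In this toy setting, the one subtype that did not receive the added call is reported as an exception, which is the kind of potentially inconsistent change a reviewer might want to look at first.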
