Meditor: Inference and Application of API Migration Edits

Developers build programs based on software libraries. When a library evolves, programmers need to migrate their client code from the library's old release(s) to new release(s). Due to the API backwards incompatibility issues, such code migration may require developers to replace API usage and apply extra edits (e.g., statement insertions or deletions) to ensure the syntactic or semantic correctness of migrated code. Existing tools extract API replacement rules without handling the additional edits necessary to fulfill a migration task. This paper presents our novel approach, Meditor, which extracts and applies the necessary edits together with API replacement changes. Meditor has two phases: inference and application of migration edits. For edit inference, Meditor mines open source repositories for migration-related (MR) commits, and conducts program dependency analysis on changed Java files to locate and cluster MR code changes. From these changes, Meditor further generalizes API migration edits by abstracting away unimportant details (e.g., concrete variable identifiers). For edit application, Meditor matches a given program with inferred edits to decide which edit is applicable, customizes each applicable edit, and produces a migrated version for developers to review. We applied Meditor to four popular libraries: Lucene, CraftBukkit, Android SDK, and Commons IO. By searching among 602,249 open source projects on GitHub, Meditor identified 1,368 unique migration edits. Among these edits, 885 edits were extracted from single updated statements, while the other 483 more complex edits were from multiple co-changed statements. We sampled 937 inferred edits for manual inspection and found all of them to be correct. Our evaluation shows that Meditor correctly applied code migrations in 218 out of 225 cases. This research will help developers automatically adapt client code to different library versions.

[1]  Hridesh Rajan,et al.  Exploiting implicit beliefs to resolve sparse usage problem in usage-based specification mining , 2017, Proc. ACM Program. Lang..

[2]  Felix A. Fischer,et al.  How Reliable is the Crowdsourced Knowledge of Security Implementation? , 2019, 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE).

[3]  Eleni Stroulia,et al.  UMLDiff: an algorithm for object-oriented design differencing , 2005, ASE.

[4]  Jeff H. Perkins,et al.  Automatically generating refactorings to support API evolution , 2005, PASTE '05.

[5]  Anh Tuan Nguyen,et al.  Statistical learning approach for mining API usage mappings for code migration , 2014, ASE.

[6]  Wei Wu,et al.  AURA: a hybrid approach to identify framework evolution , 2010, 2010 ACM/IEEE 32nd International Conference on Software Engineering.

[7]  Gabriele Bavota,et al.  The Evolution of Project Inter-dependencies in a Software Ecosystem: The Case of Apache , 2013, 2013 IEEE International Conference on Software Maintenance.

[8]  Qing Wang,et al.  Mining API mapping for language migration , 2010, 2010 ACM/IEEE 32nd International Conference on Software Engineering.

[9]  Hoan Anh Nguyen,et al.  Graph-based mining of multiple object usage patterns , 2009, ESEC/FSE '09.

[10]  Miryung Kim,et al.  An Empirical Study of API Stability and Adoption in the Android Ecosystem , 2013, 2013 IEEE International Conference on Software Maintenance.

[11]  Mira Mezini,et al.  Mining framework usage changes from instantiation code , 2008, 2008 ACM/IEEE 30th International Conference on Software Engineering.

[12]  Tao Xie,et al.  Inferring method specifications from natural language API descriptions , 2012, 2012 34th International Conference on Software Engineering (ICSE).

[13]  Anh Tuan Nguyen,et al.  Lexical statistical machine translation for language migration , 2013, ESEC/FSE 2013.

[14]  Ralph E. Johnson,et al.  The role of refactorings in API evolution , 2005, 21st IEEE International Conference on Software Maintenance (ICSM'05).

[15]  Julia L. Lawall,et al.  Understanding collateral evolution in Linux device drivers , 2006, EuroSys '06.

[16]  Harald C. Gall,et al.  Change Distilling:Tree Differencing for Fine-Grained Source Code Change Extraction , 2007, IEEE Transactions on Software Engineering.

[17]  Shuvendu K. Lahiri,et al.  Helping Developers Help Themselves: Automatic Decomposition of Code Review Changesets , 2015, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering.

[18]  Tao Xie,et al.  An Empirical Study on Evolution of API Documentation , 2011, FASE.

[19]  Hong Mei,et al.  An Empirical Study on API Usages , 2019, IEEE Transactions on Software Engineering.

[20]  Miryung Kim,et al.  A graph-based approach to API usage adaptation , 2010, OOPSLA.

[21]  Andreas Zeller,et al.  Mining temporal specifications from object usage , 2011, Automated Software Engineering.

[22]  Zhendong Su,et al.  Javert: fully automatic mining of general temporal properties from dynamic traces , 2008, SIGSOFT '08/FSE-16.

[23]  Danny Dig,et al.  API code recommendation using statistical learning from fine-grained changes , 2016, SIGSOFT FSE.

[24]  Dawson R. Engler,et al.  Bugs as deviant behavior: a general approach to inferring errors in systems code , 2001, SOSP.

[25]  Sunghun Kim,et al.  Properties of Signature Change Patterns , 2006, 2006 22nd IEEE International Conference on Software Maintenance.

[26]  Jens Dietrich,et al.  Broken promises: An empirical study into evolution problems in Java programs caused by library upgrades , 2014, 2014 Software Evolution Week - IEEE Conference on Software Maintenance, Reengineering, and Reverse Engineering (CSMR-WCRE).

[27]  Marco Tulio Valente,et al.  How do developers react to API evolution? The Pharo ecosystem case , 2015, 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME).

[28]  J. Henkel,et al.  CatchUp! Capturing and replaying refactorings to support API evolution , 2005, Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005..

[29]  Zhenchang Xing,et al.  Refactoring Practice: How it is and How it Should be Supported - An Eclipse Case Study , 2006, 2006 22nd IEEE International Conference on Software Maintenance.

[30]  Mukund Raghothaman,et al.  SWIM: Synthesizing What I Mean - Code Search and Idiomatic Snippet Synthesis , 2015, 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE).

[31]  Andreas Zeller,et al.  The impact of tangled code changes on defect prediction models , 2015, Empirical Software Engineering.

[32]  Eleni Stroulia,et al.  API-Evolution Support with Diff-CatchUp , 2007, IEEE Transactions on Software Engineering.

[33]  André van der Hoek,et al.  CodeExchange: Supporting Reformulation of Internet-Scale Code Queries in Context (T) , 2015, 2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE).

[34]  Anh Tuan Nguyen,et al.  Divide-and-Conquer Approach for Multi-phase Statistical Migration for Source Code (T) , 2015, 2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE).

[35]  Lu Zhang,et al.  A history-based matching approach to identification of framework evolution , 2012, 2012 34th International Conference on Software Engineering (ICSE).

[36]  Michael Backes,et al.  Stack Overflow Considered Harmful? The Impact of Copy&Paste on Android Application Security , 2017, 2017 IEEE Symposium on Security and Privacy (SP).

[37]  Arie van Deursen,et al.  Semantic Versioning versus Breaking Changes: A Study of the Maven Repository , 2014, 2014 IEEE 14th International Working Conference on Source Code Analysis and Manipulation.

[38]  Robert J. Walker,et al.  Seeking the ground truth: a retroactive study on the evolution and migration of software libraries , 2012, SIGSOFT FSE.

[39]  Xiaodong Gu,et al.  Deep API learning , 2016, SIGSOFT FSE.

[40]  Matús Sulír,et al.  A quantitative study of Java software buildability , 2016, PLATEAU@SPLASH.

[41]  David Notkin,et al.  Semi-automatic update of applications in response to library changes , 1996, 1996 Proceedings of International Conference on Software Maintenance.