Automated software transplantation

Automated transplantation would open many exciting avenues for software development: suppose we could autotransplant code from one system into another, entirely unrelated, system. This paper introduces a theory, an algorithm, and a tool that achieve this. Leveraging lightweight annotation, program analysis identifies an organ (interesting behavior to transplant); testing validates that the organ exhibits the desired behavior during its extraction and after its implantation into a host. While we do not claim automated transplantation is now a solved problem, our results are encouraging: we report that in 12 of 15 experiments, involving 5 donors and 3 hosts (all popular real-world systems), we successfully autotransplanted new functionality and passed all regression tests. Autotransplantation is also already useful: in 26 hours of computation time we successfully autotransplanted the H.264 video encoding functionality from the x264 system to the VLC media player. Compare this to upgrading x264 within VLC, a task that we estimate, from VLC's version history, took human programmers an average of 20 days of elapsed, as opposed to dedicated, time.
