CPR: cross platform binary code reuse via platform independent trace program

The rapid growth of Internet of Things (IoT) has been created a number of new platforms recently. Unfortunately, such variety of IoT devices causes platform fragmentation which makes software development on such devices challenging. In particular, existing programs cannot be simply reused on such devices as they rely on certain underlying hardware and software interfaces which we call platform dependencies. In this paper, we present CPR, a novel technique that synthesizes a platform independent program from a platform dependent program. Specifically, we leverage an existing system called PIEtrace which can generate a platform independent trace program. The generated trace program is platform independent while it can only reproduce a specific execution path. Hence, we develop an algorithm to merge a set of platform independent trace programs and synthesize a general program that can take multiple inputs. The synthesized platform-independent program is representative of the merged trace programs and the results produced by the program is correct if no exceptions occur. Our evaluation results on 15 real-world applications show that CPR is highly effective on reusing existing binaries across platforms.

[1]  Eugene H. Spafford,et al.  Extending mutation testing to find environmental bugs , 1990, Softw. Pract. Exp..

[2]  Peter T. Breuer,et al.  Decompilation: the enumeration of types and grammars , 1994, TOPL.

[3]  David Keppel,et al.  Shade: a fast instruction-set simulator for execution profiling , 1994, SIGMETRICS.

[4]  Brian Walters,et al.  VMware Virtual Platform , 1999 .

[5]  Samuel T. King,et al.  ReVirt: enabling intrusion analysis through virtual-machine logging and replay , 2002, OPSR.

[6]  HarrisTim,et al.  Xen and the art of virtualization , 2003 .

[7]  Min Xu,et al.  A "flight data recorder" for enabling full-system multiprocessor deterministic replay , 2003, ISCA '03.

[8]  Srikanth Kandula,et al.  Flashback: A Lightweight Extension for Rollback and Deterministic Replay for Software Debugging , 2004, USENIX Annual Technical Conference, General Track.

[9]  Mike Van Emmerik,et al.  Using a decompiler for real-world source recovery , 2004, 11th Working Conference on Reverse Engineering.

[10]  John S. Baras,et al.  ATEMU: a fine-grained sensor network simulator , 2004, 2004 First Annual IEEE Communications Society Conference on Sensor and Ad Hoc Communications and Networks, 2004. IEEE SECON 2004..

[11]  Satish Narayanasamy,et al.  BugNet: continuously recording program execution for deterministic replay debugging , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).

[12]  Jens Palsberg,et al.  Avrora: scalable sensor network simulation with precise timing , 2005, IPSN 2005. Fourth International Symposium on Information Processing in Sensor Networks, 2005..

[13]  Fabrice Bellard,et al.  QEMU, a Fast and Portable Dynamic Translator , 2005, USENIX ATC, FREENIX Track.

[14]  Yasushi Saito,et al.  Jockey: a user-space library for record-replay debugging , 2005, AADEBUG'05.

[15]  Sanjay Bhansali,et al.  Framework for instruction-level tracing and analysis of program executions , 2006, VEE '06.

[16]  Yan Tang,et al.  Efficient checkpointing of java software using context-sensitive capture and replay , 2007, ESEC-FSE '07.

[17]  Tal Garfinkel,et al.  VMwareDecoupling Dynamic Program Analysis from Execution in Virtual Environments , 2008, USENIX Annual Technical Conference.

[18]  Xin Yao,et al.  A novel co-evolutionary approach to automatic software bug fixing , 2008, 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence).

[19]  Peter M. Chen,et al.  Execution replay of multiprocessor virtual machines , 2008, VEE '08.

[20]  Claire Le Goues,et al.  Automatically finding patches using genetic programming , 2009, 2009 IEEE 31st International Conference on Software Engineering.

[21]  Ion Stoica,et al.  ODR: output-deterministic replay for multicore debugging , 2009, SOSP '09.

[22]  Angelos D. Keromytis,et al.  ASSURE: automatic software self-healing using rescue points , 2009, ASPLOS.

[23]  Christopher Krügel,et al.  Inspector Gadget: Automated Extraction of Proprietary Gadgets from Malware Binaries , 2010, 2010 IEEE Symposium on Security and Privacy.

[24]  Adrian Perrig,et al.  XTRec: Secure Real-Time Execution Trace Recording on Commodity Platforms , 2011, 2011 44th Hawaii International Conference on System Sciences.

[25]  John A. Clark,et al.  Evolutionary Improvement of Programs , 2011, IEEE Transactions on Evolutionary Computation.

[26]  Stephen McCamant,et al.  Differential Slicing: Identifying Causal Execution Differences for Security Applications , 2011, 2011 IEEE Symposium on Security and Privacy.

[27]  David Brumley,et al.  BAP: A Binary Analysis Platform , 2011, CAV.

[28]  Jonathon T. Giffin,et al.  2011 IEEE Symposium on Security and Privacy Virtuoso: Narrowing the Semantic Gap in Virtual Machine Introspection , 2022 .

[29]  John A. Clark,et al.  The GISMOE challenge: constructing the pareto program surface using genetic programming to find better programs (keynote paper) , 2012, 2012 Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering.

[30]  Xiangyu Zhang,et al.  PIEtrace: Platform independent executable trace , 2013, 2013 28th IEEE/ACM International Conference on Automated Software Engineering (ASE).

[31]  David Brumley,et al.  Native x86 Decompilation Using Semantics-Preserving Structural Analysis and Iterative Control-Flow Structuring , 2013, USENIX Security Symposium.

[32]  Rajeev Barua,et al.  A compiler-level intermediate representation based binary analysis and rewriting system , 2013, EuroSys '13.

[33]  Mark Harman,et al.  Genetic programming for Reverse Engineering , 2013, 2013 20th Working Conference on Reverse Engineering (WCRE).

[34]  Claire Le Goues,et al.  Current challenges in automatic software repair , 2013, Software Quality Journal.

[35]  Xiangyu Zhang,et al.  Obfuscation resilient binary code reuse through trace-oriented programming , 2013, CCS.

[36]  Mark Harman,et al.  Using Genetic Improvement and Code Transplants to Specialise a C++ Program to a Problem Class , 2014, EuroGP.

[37]  Dirk Merkel,et al.  Docker: lightweight Linux containers for consistent development and deployment , 2014 .

[38]  Michael G. Epitropakis,et al.  Gen-O-Fix: An embeddable framework for Dynamic Adaptive Genetic Improvement Programming , 2014 .

[39]  Xiangyu Zhang,et al.  Reuse-oriented reverse engineering of functional components from x86 binaries , 2014, ICSE.

[40]  Mark Harman,et al.  Automated software transplantation , 2015, ISSTA.

[41]  Mark Harman,et al.  Ieee Transactions on Evolutionary Computation 1 , 2022 .

[42]  Dinghao Wu,et al.  Reassembleable Disassembling , 2015, USENIX Security Symposium.

[43]  Lida Xu,et al.  The internet of things: a survey , 2014, Information Systems Frontiers.

[44]  Xiangyu Zhang,et al.  Dual Execution for On the Fly Fine Grained Execution Comparison , 2015, ASPLOS.

[45]  M. Christiansen Bypassing Malware Defenses , 2019 .