Detecting Behaviorally Equivalent Functions via Symbolic Execution

Software bugs are a reality of programming. They can be difficult to identify and resolve, even for the most experienced programmers. Certain bugs may even be impossible to remove because they provide some desired functionality. For this reason, designers of modern security-critical applications must accept the inevitable existence of bugs and find ways to detect and recover from the errors they cause. One approach to error detection involves running multiple implementations of a single program at the same time, on the same input, and comparing the results. Divergence in the behavior of the different implementations indicates the existence of a bug. The question we consider in this paper is how to construct these diverse implementations of security-critical programs in a cost-effective way. The solution we propose is to first find existing diverse function implementations and then use these function implementations as building blocks for diverse program implementations. To find diverse function implementations, we use a technique we call adaptor synthesis to compare arbitrary functions for behavioral equivalence. To account for differences in input argument structure between arbitrary functions, we allow for adaptor functions, or adaptors, that convert from one argument structure to another. Using adaptors, the problem of determining whether two arbitrary functions are behaviorally equivalent becomes the problem of synthesizing an adaptor between the two functions that makes their output equivalent on all inputs in a specified domain. Along with presenting our adaptor synthesis technique, we describe an implementation for comparing functions for behavioral equivalence at the binary level on the Linux x86-64 platform, using a family of adaptors that allows arithmetic combinations of integer values.
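To make the adaptor-synthesis idea concrete, the following is a minimal, hypothetical sketch in Python. The paper operates on x86-64 binaries and uses symbolic execution to check equivalence over a full input domain; this sketch instead checks equivalence by exhaustive testing over a small integer domain, and its adaptor family is restricted to argument permutations rather than the paper's richer arithmetic combinations. All function names here (`target`, `candidate`, `synthesize_adaptor`) are illustrative, not from the paper.

```python
# Hypothetical illustration of adaptor synthesis. The real system checks
# equivalence symbolically on binaries; here we exhaustively test a small
# domain, and the adaptor family only permutes arguments (a simplification
# of the paper's arithmetic-combination adaptors).
import itertools


def target(x, y):
    """Reference implementation."""
    return x + 2 * y


def candidate(a, b):
    """Candidate with a different argument structure: behaviorally
    equivalent to target only after swapping its arguments."""
    return 2 * a + b


def synthesize_adaptor(f, g, arity, domain):
    """Search for a permutation adaptor p such that
    g(args[p[0]], args[p[1]], ...) == f(*args) for every args in domain."""
    for perm in itertools.permutations(range(arity)):
        if all(g(*[args[i] for i in perm]) == f(*args)
               for args in itertools.product(domain, repeat=arity)):
            return perm  # adaptor found: f and g are equivalent under it
    return None  # no adaptor in this family makes the functions agree


adaptor = synthesize_adaptor(target, candidate, 2, range(-8, 8))
print(adaptor)  # the swap (1, 0) makes candidate match target
```

In this toy setting the synthesizer reports the argument swap `(1, 0)`, witnessing that `candidate` is behaviorally equivalent to `target` up to an adaptor; a failed search (returning `None`) would be counterexample-backed evidence of inequivalence over the tested domain.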
