Interface Compliance of Inline Assembly: Automatically Check, Patch and Refine

Inline assembly is still common practice in low-level C programming, typically for efficiency or to access specific hardware resources. Embedded assembly code written in the GNU syntax (supported by major compilers such as GCC, Clang and ICC) comes with an interface specifying how the assembly code interacts with the C environment. For simplicity, the compiler treats GNU inline assembly as a black box and relies only on this interface to correctly glue it into the compiled C code. The adequacy between an assembly chunk and its interface (named compliance) is therefore of primary importance, since compliance issues can lead to subtle and hard-to-find bugs. We propose RUSTInA, the first automated technique for formally checking inline assembly compliance, with the additional ability to propose (proven) patches and (optimization) refinements in certain cases. RUSTInA is based on an original formalization of the inline assembly compliance problem together with novel dedicated algorithms. Our prototype has been evaluated on 202 Debian packages containing inline assembly (2656 chunks), finding 2183 issues in 85 packages, of which 986 are significant issues spread over 54 packages (including major projects such as ffmpeg or ALSA), and proposing patches for 92% of them. So far, 38 patches have been accepted (solving 156 significant issues), with positive feedback from development teams.
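To make the notion of an interface concrete, here is a minimal sketch in C of two GNU inline assembly chunks whose interfaces match what their code does. The function names `add_u32` and `add_via_scratch` are hypothetical and do not come from the paper or from any analyzed package; the example assumes an x86 target and falls back to plain C elsewhere. It only illustrates what the interface declares (outputs, inputs, clobbers), not how RUSTInA checks it.

```c
#include <stdint.h>

/* Hypothetical example: the assembly only touches its declared operands. */
static uint32_t add_u32(uint32_t a, uint32_t b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ ("addl %1, %0"
             : "+r" (a)       /* output: a is both read and written */
             : "r" (b)        /* input: b is only read */
             : "cc");         /* clobber: addl modifies the flags */
    return a;
#else
    return a + b;             /* portable fallback for non-x86 targets */
#endif
}

/* Hypothetical example: the assembly uses eax as a scratch register.
 * The interface must say so; leaving "eax" out of the clobber list would
 * make the chunk non-compliant, because the compiler could then keep a
 * live value (or one of the operands) in eax across the chunk. */
static uint32_t add_via_scratch(uint32_t a, uint32_t b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ ("movl %1, %%eax\n\t"
             "addl %%eax, %0"
             : "+r" (a)
             : "r" (b)
             : "eax", "cc");  /* scratch register and flags are declared */
    return a;
#else
    return a + b;             /* portable fallback for non-x86 targets */
#endif
}
```

Because the compiler sees only the three colon-separated lists, a chunk with a missing clobber often still compiles and may even pass tests, then break when a different optimization level or compiler version changes register allocation.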
