WYSIWIB: exploiting fine‐grained program structure in a scriptable API‐usage protocol‐finding process

Bug‐finding tools rely on specifications of what is correct or incorrect code. As it is difficult for a tool developer or user to anticipate all possible specifications, strategies for inferring specifications have been proposed. These strategies obtain probable specifications by observing common characteristics of code or execution traces, typically focusing on sequences of function calls. To counter the observed high rate of false positives, heuristics have been proposed for ranking or pruning the results. These heuristics, however, can result in false negatives, especially for rarely used functions. In this paper, we propose an alternative approach to specification inference, in which the user guides the inference process using patterns of code that reflect the user's understanding of the conventions and design of the targeted software project. We focus on specifications describing the correct usage of API functions, which we refer to as API protocols. Our approach builds on the Coccinelle program matching and transformation tool, which allows a user to construct patterns that reflect the structure of the code to be matched. We evaluate our approach on the source code of the Linux kernel, which defines a very large number of API functions with varying properties. Linux is also critical software, implying that fixing even bugs involving rarely used protocols is essential. In our experiments, we use our approach to find over 3000 potential API protocols, with an estimated false positive rate of under 15%, and use these protocols to find over 360 bugs in the use of API functions. Copyright © 2012 John Wiley & Sons, Ltd.
