Learning minimal abstractions

Static analyses are generally parametrized by an abstraction chosen from a family of abstractions. We are interested in flexible families of abstractions with many parameters, as these families allow one to increase precision in ways tailored to the client without sacrificing scalability. For example, we consider k-limited points-to analyses in which each call site and allocation site in a program can have a different k value. In this paper, we ask a natural question: what is the minimal (coarsest) abstraction in a given family that is able to prove a set of queries? In addressing this question, we make two contributions: (i) we introduce two machine learning algorithms for efficiently finding a minimal abstraction; and (ii) for a static race detector backed by a k-limited points-to analysis, we show empirically that minimal abstractions are actually quite coarse: providing context/object sensitivity to a very small fraction (0.4-2.3%) of the sites suffices to obtain results as precise as those obtained by providing context/object sensitivity uniformly to all sites.
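
To make the notion of a minimal abstraction concrete, the following is a minimal sketch of a naive baseline search. It is not either of the two machine learning algorithms the abstract refers to. It makes several assumptions that do not come from the abstract: each site has a binary k value (a site in the set is refined, every other site is not), the hypothetical oracle `proves` stands in for running the static analysis and checking the queries, and the oracle is monotone (refining more sites never loses a proof). The names `minimal_abstraction`, `proves`, `toy_proves`, and the site labels are all illustrative.

```python
from typing import Callable, FrozenSet, Iterable

# Assumption: an abstraction is the set of sites that receive k = 1;
# every site outside the set receives k = 0.
Abstraction = FrozenSet[str]


def minimal_abstraction(
    sites: Iterable[str],
    proves: Callable[[Abstraction], bool],
) -> Abstraction:
    """Coarsen the finest abstraction one site at a time.

    `proves` is a hypothetical oracle standing in for a full run of the
    static analysis. If it is monotone, the result is minimal: removing
    any remaining site would make the queries unprovable.
    """
    current = set(sites)
    if not proves(frozenset(current)):
        raise ValueError("the finest abstraction does not prove the queries")
    for site in sorted(current):
        candidate = current - {site}
        # Keep the coarser abstraction only if the queries are still proven.
        if proves(frozenset(candidate)):
            current = candidate
    return frozenset(current)


# Toy oracle (illustrative only): the queries are proven exactly when
# the two sites "c7" and "h3" are both refined.
def toy_proves(abstraction: Abstraction) -> bool:
    return {"c7", "h3"} <= abstraction


all_sites = [f"c{i}" for i in range(1, 10)] + [f"h{i}" for i in range(1, 6)]
result = minimal_abstraction(all_sites, toy_proves)
```

This baseline calls the oracle once per site plus once for the initial check, so with each call being a full analysis run it becomes expensive when a program has many sites. That cost is the reason more efficient search strategies are of interest.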
