Field-sensitive program dependence analysis

Statement st transitively depends on statement st_seed if the execution of st_seed may affect the execution of st. Computing transitive program dependences is a fundamental operation in many automatic software analysis tools. Existing tools find it challenging to compute transitive dependences for programs that manipulate large aggregate structure variables, and their limitations adversely affect the analysis of certain important classes of software systems, e.g., large-scale enterprise resource planning (ERP) systems. This paper presents an efficient, conservative, interprocedural static analysis algorithm for computing field-sensitive transitive program dependences in the presence of large aggregate structure variables. Our key insight is that program dependences arising from operations on whole substructures can be represented precisely (i.e., field-sensitively) at the granularity of substructures instead of individual fields. Technically, we adapt the interval domain to concisely record dependences between multiple pairs of fields of aggregate structure variables by exploiting the fields' spatial arrangement. We prove that our algorithm is as precise as any algorithm that works at the granularity of individual fields, the most precise known approach for this problem. Our empirical study, in which we analyzed industrial ERP programs averaging over 100,000 lines of code, shows significant improvements in both running time and memory consumption over existing approaches. The baseline is an efficient field-insensitive whole-structure algorithm that incurs a 62% false error rate. An atomization-based algorithm, which disassembles every aggregate structure variable into the collection of its individual fields, removes all these false errors at the cost of doubling the average analysis time, from 30 to 60 minutes. In contrast, our new precise algorithm removes all false errors while increasing the time only to 35 minutes. In terms of memory consumption, our algorithm increases the footprint by less than 10%, compared to the 50% overhead of the atomizing algorithm.
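
To make the key insight concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes a hypothetical byte-offset layout (`LAYOUT`), a hypothetical `IntervalDep` record, and a hypothetical whole-substructure copy `backup.header = order.header`. One interval record stands for every field pair that an atomizing analysis would store separately, and a per-field query recovers the same answer by shifting offsets.

```python
from dataclasses import dataclass

# Hypothetical layout (an assumption for illustration):
# field name -> (byte offset, size in bytes).
LAYOUT = {
    "header.id": (0, 4),
    "header.date": (4, 8),
    "header.customer": (12, 20),
    "amount": (32, 8),
}


@dataclass(frozen=True)
class IntervalDep:
    """target[t_start + i] depends on source[s_start + i] for 0 <= i < length."""
    target: str
    t_start: int
    source: str
    s_start: int
    length: int

    def sources_of(self, start, size):
        """Map a queried target byte range onto the source range it depends on.

        Returns (source variable, source offset, size), or None if the
        queried range does not overlap this dependence.
        """
        lo = max(start, self.t_start)
        hi = min(start + size, self.t_start + self.length)
        if lo >= hi:
            return None
        shift = self.s_start - self.t_start
        return (self.source, lo + shift, hi - lo)


def atomized_pairs(dep, layout):
    """Expand one interval record into the per-field pairs it stands for."""
    pairs = []
    for name, (offset, size) in layout.items():
        hit = dep.sources_of(offset, size)
        if hit is not None:
            pairs.append(((dep.target, name), hit))
    return pairs


# One record for the hypothetical statement `backup.header = order.header`,
# which copies bytes [0, 32).
copy_header = IntervalDep("backup", 0, "order", 0, 32)

print(copy_header.sources_of(*LAYOUT["header.date"]))  # ('order', 4, 8)
print(copy_header.sources_of(*LAYOUT["amount"]))       # None
print(len(atomized_pairs(copy_header, LAYOUT)))        # 3
```

In this sketch the interval record answers the same per-field queries as the three atomized pairs, while the storage cost stays at one record regardless of how many fields the copied substructure contains.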
