Design and implementation of a dependence-based taint analysis

Injection vulnerabilities, which are caused by the misuse of unvalidated input, are the major threat to Web programs. They can be located by taint analysis, which traces the propagation and use of input data. Based on a formal definition of the dependence relations among object variables and object fields in the Jimple intermediate language, an inter-procedural algorithm is proposed to build a field-sensitive data dependence graph. The dependence relations among the parameters of Jimple methods are modeled explicitly, and a reachability matrix is used to traverse all taint propagation paths. To analyze large-scale programs, the analysis is decomposed into multiple stages, each of which completes a sub-task, so that the paths are traversed iteratively. A prototype is implemented on top of Soot and used to analyze several Web sites; experimental results show better time performance with no loss of precision compared with existing approaches.
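
As a rough illustration of the matrix-based traversal described above, the following sketch builds a tiny field-sensitive dependence graph and computes which sinks are reachable from a taint source. It is not the algorithm of this work: it is a minimal, single-method example that assumes Warshall's transitive closure as the way to fill the matrix, and every node name (`input`, `user.name`, `user.id`, `query`, `log`) and function name is hypothetical.

```python
def reachability(nodes, edges):
    """Return (index, matrix), where matrix[i][j] is True if node j
    depends, directly or transitively, on node i.

    Assumption: the matrix is filled with Warshall's transitive closure;
    the source text does not say how its matrix is computed.
    """
    index = {name: i for i, name in enumerate(nodes)}
    size = len(nodes)
    reach = [[False] * size for _ in range(size)]
    for src, dst in edges:
        reach[index[src]][index[dst]] = True
    for k in range(size):          # intermediate node
        for i in range(size):      # start node
            if not reach[i][k]:
                continue
            for j in range(size):  # end node
                if reach[k][j]:
                    reach[i][j] = True
    return index, reach


def tainted_sinks(nodes, edges, sources, sinks):
    """List the sinks that at least one taint source can reach."""
    index, reach = reachability(nodes, edges)
    return sorted(
        sink for sink in sinks
        if any(reach[index[src]][index[sink]] for src in sources)
    )


# Hypothetical graph. An edge (a, b) means "b depends on a".
# Fields of the same object are separate nodes (field sensitivity).
NODES = ["input", "const", "user.name", "user.id", "query", "log"]
EDGES = [
    ("input", "user.name"),  # user.name = input
    ("const", "user.id"),    # user.id = const
    ("user.name", "query"),  # query = "..." + user.name
    ("user.id", "log"),      # log = user.id
]

if __name__ == "__main__":
    print(tainted_sinks(NODES, EDGES, sources=["input"], sinks=["query", "log"]))
```

Because `user.name` and `user.id` are distinct nodes, only `query` is reported as tainted. A field-insensitive graph that merged both fields into one `user` node would also flag `log`, which is the kind of false positive that field sensitivity is meant to avoid.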