Modeling the HTML DOM and browser API in static analysis of JavaScript web applications

Developers of JavaScript web applications have little tool support for catching errors early in development. In comparison, an abundance of tools exist for statically typed languages, including sophisticated integrated development environments and specialized static analyses. Transferring such technologies to the domain of JavaScript web applications is challenging. In this paper, we discuss the challenges, which include the dynamic aspects of JavaScript and the complex interactions between JavaScript, HTML, and the browser. From this, we present the first static analysis that is capable of reasoning about the flow of control and data in modern JavaScript applications that interact with the HTML DOM and browser API. One application of such a static analysis is to detect type-related and dataflow-related programming errors. We report on experiments with a range of modern web applications, including Chrome Experiments and IE Test Drive applications, to measure the precision and performance of the technique. The experiments indicate that the analysis is able to show absence of errors related to missing object properties and to identify dead and unreachable code. By measuring the precision of the types inferred for object properties, the analysis is precise enough to show that most expressions have unique types. By also producing precise call graphs, the analysis additionally shows that most invocations in the programs are monomorphic. We furthermore study the usefulness of the analysis to detect spelling errors in the code. Despite the encouraging results, not all problems are solved and some of the experiments indicate a potential for improvement, which allows us to identify central remaining challenges and outline directions for future work.

[1]  Sorin Lerner,et al.  Staged information flow for javascript , 2009, PLDI '09.

[2]  Kwang-Moo Choe,et al.  Points-to analysis for JavaScript , 2009, SAC '09.

[3]  Peter Thiemann A Type Safe DOM API , 2005, DBPL.

[4]  Peter Thiemann,et al.  Interprocedural Analysis with Lazy Propagation , 2010, SAS.

[5]  Sophia Drossopoulou,et al.  Towards Type Inference for JavaScript , 2005, ECOOP.

[6]  Shriram Krishnamurthi,et al.  Typing Local Control and State Using Flow Analysis , 2011, ESOP.

[7]  Salvatore Guarnieri GULFSTREAM: Staged Static Analysis for Streaming JavaScript Applications , 2010, WebApps.

[8]  Peter Thiemann,et al.  Type Analysis for JavaScript , 2009, SAS.

[9]  Jan Vitek,et al.  An analysis of the dynamic behavior of JavaScript programs , 2010, PLDI '10.

[10]  Jeffrey D. Ullman,et al.  Monotone data flow analysis frameworks , 1977, Acta Informatica.

[11]  Shriram Krishnamurthi,et al.  Using static analysis for Ajax intrusion detection , 2009, WWW '09.

[12]  Ankur Taly,et al.  An Operational Semantics for JavaScript , 2008, APLAS.

[13]  Thomas W. Reps,et al.  Recency-Abstraction for Heap-Allocated Storage , 2006, SAS.

[14]  Francesco Logozzo,et al.  RATA: Rapid Atomic Type Analysis by Abstract Interpretation - Application to JavaScript Optimization , 2010, CC.

[15]  Jan Vitek,et al.  The Eval That Men Do - A Large-Scale Study of the Use of Eval in JavaScript Applications , 2011, ECOOP.

[16]  Benjamin Livshits,et al.  GATEKEEPER: Mostly Static Enforcement of Security and Reliability Policies for JavaScript Code , 2009, USENIX Security Symposium.

[17]  Peter Thiemann Towards a Type System for Analyzing JavaScript Programs , 2005, ESOP.