Practical Analysis of the Dynamic Characteristics of JavaScript

JavaScript is a dynamic object-oriented programming language designed with flexible programming mechanisms. It is widely used to develop sophisticated software systems, especially web applications. Despite its popularity, there is a lack of software tools that support JavaScript for software engineering clients. Dataflow analysis, which approximates software behavior by analyzing program code, is the foundation for many such tools. However, several unique features of JavaScript render existing dataflow analysis techniques ineffective. Reflective constructs, which generate code at runtime, make it difficult to obtain the complete program at compile time. Dynamic typing, which allows object behavior to change, makes it challenging to build accurate models of objects. A variadic function can exhibit different functionality from call to call, because its behavior may depend on arguments whose values are known only at runtime. Object constructors may be polymorphic, so that objects created by the same constructor contain different properties. In addition to object-oriented programming, JavaScript supports the functional and procedural paradigms; dataflow analysis techniques become ineffective when a JavaScript application mixes several paradigms. Dataflow analysis needs to handle these challenges.

In this work, we present an analysis framework and several dataflow analyses that can handle dynamic features in JavaScript. The first contribution of our work is the design and instantiation of the JavaScript Blended Analysis Framework (JSBAF). This general-purpose and flexible framework judiciously combines dynamic and static analyses. We have implemented an instance of JSBAF, blended taint analysis, to demonstrate the practicality of the framework. Our second contribution is a novel context-sensitive points-to analysis for JavaScript that accurately models object property changes. This algorithm uses a new program representation that enables partially flow-sensitive analysis, a more accurate object representation, and an expanded points-to graph. We have defined parameterized state sensitivity (i.e., k-state sensitivity) and evaluated the effectiveness of 1-state-sensitive analysis as the static phase of JSBAF. The third contribution of our work is an adaptive context-sensitive analysis that selectively applies context-sensitive analysis at the function level. This two-stage adaptive analysis extracts function characteristics from an inexpensive points-to analysis and uses learning-based heuristics to choose an appropriate context-sensitive analysis for each function. The experimental results show that the adaptive analysis is more precise than any single context-sensitive analysis for several programs in the benchmarks, especially multi-paradigm programs.

The research in this thesis was supported by National Science Foundation grant CCF-0811518 and the IBM Open Collaborative Research Program.
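
The dynamic features named in the abstract (runtime code generation, object property changes, variadic functions, and polymorphic constructors) can be illustrated with a minimal JavaScript sketch. All identifiers below (`getter`, `account`, `describe`, `Shape`) are hypothetical names chosen for illustration; they are not taken from the thesis or its benchmarks.

```javascript
// Illustrative sketch only; every name here is an assumption, not thesis code.

// 1. Reflective construct: the function body is assembled from strings at
//    runtime, so a purely static analysis never sees its source text.
const propName = ["na", "me"].join("");
const getter = new Function("obj", "return obj." + propName + ";");

// 2. Object property changes: properties are added and deleted after
//    creation, so one allocation site yields objects of varying shape.
const account = { name: "alice" };
account.balance = 10;   // property added after creation
delete account.name;    // property removed

// 3. Variadic function: behavior depends on the number and types of the
//    arguments, which are known only at runtime.
function describe(...args) {
  if (args.length === 0) return "empty";
  if (typeof args[0] === "function") return "callback";
  return "values:" + args.length;
}

// 4. Polymorphic constructor: objects built by the same constructor end up
//    with different sets of properties.
function Shape(kind, size) {
  this.kind = kind;
  if (kind === "circle") this.radius = size;
  else this.side = size;
}

const circle = new Shape("circle", 2);
const square = new Shape("square", 3);
```

Each fragment is a case in which the property set or the executed code cannot be read directly off the program text, which is the kind of imprecision the analyses described above aim to reduce.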
