Learning how to listen: Automatically finding bug patterns in event-driven JavaScript APIs

Event-driven programming is widely practiced in the JavaScript community, both on the client side to handle UI events and AJAX requests, and on the server side to accommodate long-running operations such as file or network I/O. Many popular event-based APIs allow event names to be specified as free-form strings without any validation, potentially leading to lost events for which no listener has been registered and dead listeners for events that are never emitted. In previous work, Madsen et al. presented a precise static analysis for detecting such problems, but their analysis does not scale because it may require a number of contexts that is exponential in the size of the program. Concentrating on the problem of detecting dead listeners, we present an approach to learn how to correctly use event-based APIs by first mining a large corpus of JavaScript code using a simple static analysis to identify code snippets that register an event listener, and then applying statistical modeling to identify unusual patterns, which often indicate incorrect API usage. From a large-scale evaluation on 127,531 open-source JavaScript code bases, our technique was able to detect 75 incorrect listener-registration patterns, while maintaining a precision of 90.9% and recall of 7.5% over our validation set, demonstrating that a learning-based approach to detecting event-handling bugs is feasible. In an additional experiment, we investigated instances of these patterns in 25 open-source projects, and reported 30 issues to the project maintainers, of which 7 have been confirmed as bugs.
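The two failure modes named above, dead listeners and lost events, can be illustrated with a minimal sketch. It assumes Node.js's built-in `events` module; the emitter `source`, the event names, and the flag variables are hypothetical and chosen only for illustration, not taken from the evaluated corpus.

```javascript
const { EventEmitter } = require('events');

// Hypothetical emitter standing in for a real API object such as a stream.
const source = new EventEmitter();

let finished = false;        // set by the correctly registered listener
let deadListenerRan = false; // set only if the misspelled listener ever runs

// Intended registration: this emitter does emit 'end' below.
source.on('end', () => { finished = true; });

// Dead listener: 'ended' is a misspelling. Registration succeeds without
// any warning, because event names are unvalidated strings, but the
// callback never runs since 'ended' is never emitted.
source.on('ended', () => { deadListenerRan = true; });

source.emit('end');

// Lost event: no listener is registered for 'close'. EventEmitter#emit
// returns false when the event had no listeners.
const delivered = source.emit('close');

console.log({ finished, deadListenerRan, delivered });
```

Nothing in the sketch fails at runtime: both the misspelled registration and the unheard emission are accepted silently. This is why a dead listener can only be spotted by knowing which event names an API actually emits, which is the knowledge the mining approach aims to learn from existing code.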
