How good are the specs? A study of the bug-finding effectiveness of existing Java API specifications

Runtime verification can find bugs early, during software development, by monitoring test executions against formal specifications (specs). The quality of runtime verification depends on the quality of the specs. While previous research has produced many specs for the Java API, manually or through automatic mining, there has been no large-scale study of their bug-finding effectiveness. We present the first in-depth study of the bug-finding effectiveness of previously proposed specs. We used JavaMOP to monitor 182 manually written and 17 automatically mined specs against more than 18K manually written and 2.1M automatically generated tests in 200 open-source projects. The average runtime overhead was under 4.3x. We inspected 652 violations of manually written specs and a random sample of 200 violations of automatically mined specs. We reported 95 bugs, of which developers have already fixed 74. However, most violations, 82.81% of 652 and 97.89% of 200, were false alarms. Our empirical results show that (1) runtime verification technology has matured enough to incur tolerable runtime overhead during testing, and (2) the existing API specifications can find many bugs that developers are willing to fix; however, (3) the false alarm rates are worrisome and suggest that substantial effort needs to be spent on engineering better specs and properly evaluating their effectiveness.
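To make the idea of monitoring a test execution against a spec concrete, here is a minimal sketch in plain Java. It is not JavaMOP's spec language or its generated code: the class `HasNextMonitorDemo`, the wrapper `MonitoredIterator`, and the helper `countViolations` are hypothetical names introduced only for illustration. The property checked is an assumed example of a Java API spec: `next()` should be called on an `Iterator` only after a `hasNext()` call that returned true.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical hand-rolled monitor; an illustration, not JavaMOP's API.
class HasNextMonitorDemo {

    // Wraps an iterator and records a violation whenever next() is called
    // without a preceding hasNext() call that returned true.
    static final class MonitoredIterator<T> implements Iterator<T> {
        private final Iterator<T> delegate;
        private final List<String> violations;
        private boolean safeToCallNext = false;

        MonitoredIterator(Iterator<T> delegate, List<String> violations) {
            this.delegate = delegate;
            this.violations = violations;
        }

        @Override
        public boolean hasNext() {
            // Event "hasNext": remember whether the next() call is covered.
            safeToCallNext = delegate.hasNext();
            return safeToCallNext;
        }

        @Override
        public T next() {
            // Event "next": report a violation if no successful hasNext() preceded it.
            if (!safeToCallNext) {
                violations.add("next() called without a preceding successful hasNext()");
            }
            safeToCallNext = false;
            return delegate.next();
        }
    }

    // Runs a tiny "test" over the data and returns how many violations were recorded.
    public static int countViolations(List<Integer> data, boolean checkHasNext) {
        List<String> violations = new ArrayList<>();
        Iterator<Integer> it = new MonitoredIterator<>(data.iterator(), violations);
        if (checkHasNext) {
            while (it.hasNext()) {
                it.next();
            }
        } else {
            it.next(); // unchecked call; throws if data is empty
        }
        return violations.size();
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3);
        System.out.println(countViolations(data, true));  // prints 0
        System.out.println(countViolations(data, false)); // prints 1
    }
}
```

In this sketch the unchecked `next()` call is flagged even though the list is non-empty and the call is harmless, which illustrates how a spec violation need not be a bug.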
