The Dangers and Complexities of SQLite Benchmarking

Benchmarking systems in a repeatable fashion is complex and error-prone, and the systems community has repeatedly discussed how to run benchmarks and report their results properly. Using SQLite as an example, we examine the current state of benchmarking in industry and academia. We show that changing a single SQLite parameter can change performance by 11.8X, and that changing multiple parameters together can produce up to a 28X difference in performance. We find that these configuration parameters are often neither set nor reported in academic research, leading to incomplete and misleading evaluations. Running different off-the-shelf SQLite benchmarking tools such as Mobibench and AndroBench in their default configurations shows up to a 50% difference in performance. We hope this paper incites discussion in the systems community and among SQLite developers, and that our detailed analysis helps application developers choose SQLite parameters that achieve better performance.
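As a minimal sketch of the kind of configuration the abstract refers to, the snippet below sets two well-known SQLite knobs, the journal mode and the synchronous level, via PRAGMA statements and then reads back the effective values. The specific parameter choices (WAL, NORMAL) are illustrative assumptions, not the paper's recommended settings; a benchmark should log whatever values are actually in effect.

```python
import os
import sqlite3
import tempfile

# Use an on-disk database: journaling and sync settings mainly matter
# for durable storage, and in-memory databases ignore journal_mode.
path = os.path.join(tempfile.mkdtemp(), "bench.db")
conn = sqlite3.connect(path)

# Two performance-critical SQLite knobs (values here are illustrative):
# journal_mode controls how transactions are logged (e.g. DELETE vs WAL);
# synchronous controls how aggressively SQLite issues fsync calls.
conn.execute("PRAGMA journal_mode = WAL;")
conn.execute("PRAGMA synchronous = NORMAL;")

# Read back and report the effective settings so a benchmark run can
# record its exact configuration alongside its results.
jm = conn.execute("PRAGMA journal_mode;").fetchone()[0]
sync = conn.execute("PRAGMA synchronous;").fetchone()[0]
print(jm, sync)
```

Reporting the queried values rather than the requested ones matters because some PRAGMAs are silently ignored in certain contexts, which is exactly the kind of unreported configuration drift the paper warns about.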
