Journaling of journal is (almost) free

Lightweight databases and key-value stores manage the consistency and reliability of their own data, often through rollback-recovery journaling or write-ahead logging. They further rely on file system journaling to protect the file system structure and metadata. Such journaling of journal appears to violate the classic end-to-end argument for optimal database design. In practice, we observe a significant cost (up to 73% slowdown) when adding Ext4 file system journaling to the SQLite database on a Google Nexus 7 tablet running an Ubuntu Linux installation. The cost of file system journaling is up to 58% on a conventional machine with an Intel 311 SSD. In this paper, we argue that this cost is largely due to implementation limitations of the existing system. We apply two simple techniques: ensuring a single I/O operation on the synchronous commit path, and adaptively allowing each file to have a custom journaling mode (in particular, whether to journal the file data in addition to the metadata). Compared to SQLite without file system journaling, our enhanced journaling either improves performance or incurs a minor (<6%) slowdown in all but one of our 24 test cases (with a 14% slowdown in the exceptional case). On average, our enhanced journaling implementation improves SQLite performance by 7%.
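To make "a custom journaling mode per file" concrete, the sketch below toggles the per-file data-journaling attribute that stock Linux already exposes (the `j` attribute of `chattr`, i.e. `FS_JOURNAL_DATA_FL`). It is only an illustration of the idea, not the adaptive mechanism of the paper, whose policy is not described in this abstract. The helper names are hypothetical, the ioctl numbers assume a 64-bit Linux ABI, and changing the flag normally requires elevated privileges on an ext3/ext4 file.

```python
import fcntl
import os
import struct

# Constants from <linux/fs.h>; the ioctl numbers assume a 64-bit Linux ABI.
FS_IOC_GETFLAGS = 0x80086601
FS_IOC_SETFLAGS = 0x40086602
FS_JOURNAL_DATA_FL = 0x00004000  # same attribute as `chattr +j`


def with_data_journaling(flags: int, enabled: bool) -> int:
    """Return inode flags with the data-journaling bit set or cleared."""
    if enabled:
        return flags | FS_JOURNAL_DATA_FL
    return flags & ~FS_JOURNAL_DATA_FL


def set_data_journaling(path: str, enabled: bool) -> None:
    """Hypothetical helper: switch one file between journaling its data
    (data + metadata) and the file system's default mode."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Read the current inode flags (the kernel fills in a 32-bit value).
        raw = fcntl.ioctl(fd, FS_IOC_GETFLAGS, struct.pack("I", 0))
        (flags,) = struct.unpack("I", raw)
        new_flags = with_data_journaling(flags, enabled)
        if new_flags != flags:
            fcntl.ioctl(fd, FS_IOC_SETFLAGS, struct.pack("I", new_flags))
    finally:
        os.close(fd)
```

An adaptive scheme would call something like `set_data_journaling(path, True)` only for files whose write pattern benefits from data journaling; how that decision is made is the paper's contribution and is deliberately left out here.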
