I/O Speculation for the Microsecond Era

Microsecond latencies and access times will soon dominate most datacenter I/O workloads, thanks to improvements in both storage and networking technologies. Current techniques for dealing with I/O latency target either very fast (nanosecond) or slow (millisecond) devices. These techniques are suboptimal for microsecond devices: they either block the processor for tens of microseconds, or yield the processor only to be ready to run again microseconds later. Speculation is an alternative technique that avoids the problems of both yielding and blocking by letting an application continue running until it produces an externally visible side effect. State-of-the-art techniques for speculating on I/O requests involve checkpointing, which can take up to a millisecond, squandering the performance benefits that microsecond-scale devices have to offer. In this paper, we survey how speculation can address the challenges that microsecond-scale devices will bring. We measure the potential benefit that applications can gain from speculation and examine several classes of speculation techniques. In addition, we propose two new techniques, hardware checkpoint and checkpoint-free speculation. Our exploration suggests that speculation will enable systems to extract the maximum performance from I/O devices in the microsecond era.
