Moneta: A High-Performance Storage Array Architecture for Next-Generation, Non-volatile Memories

Emerging non-volatile memory technologies such as phase change memory (PCM) promise to increase storage system performance by a wide margin relative to both conventional disks and flash-based SSDs. Realizing this potential will require significant changes to the way systems interact with storage devices as well as a rethinking of the storage devices themselves. This paper describes the architecture of Moneta, a prototype PCIe-attached storage array built from emulated PCM storage. Moneta provides a carefully designed hardware/software interface that makes issuing and completing accesses atomic. The atomic management interface, combined with hardware scheduling optimizations and an optimized storage stack, increases performance for small, random accesses by 18x and reduces software overheads by 60%. The Moneta array sustains 2.8 GB/s for sequential transfers and 541K random 4 KB IO operations per second (8x higher than a state-of-the-art flash-based SSD). Moneta can perform a 512-byte write in 9 us (5.6x faster than the SSD). Moneta provides a harmonic mean speedup of 2.1x and a maximum speedup of 9x across a range of file system, paging, and database workloads. We also explore trade-offs in Moneta's architecture between performance, power, memory organization, and memory latency.
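
As a rough aid to reading the figures above, the following sketch converts the reported random-access rate into bandwidth and shows how a harmonic mean speedup is computed. It assumes that 4 KB means 4096 bytes and that 1 GB means 10^9 bytes; the function names are illustrative, and the per-workload speedups are hypothetical placeholders, not measurements from the paper.

```python
from typing import Sequence


def iops_to_bandwidth_gb_per_s(iops: float, request_bytes: int) -> float:
    """Convert an operations-per-second rate to decimal GB/s (1 GB = 1e9 bytes)."""
    return iops * request_bytes / 1e9


def harmonic_mean(speedups: Sequence[float]) -> float:
    """Harmonic mean: the count divided by the sum of reciprocals."""
    if not speedups or any(s <= 0 for s in speedups):
        raise ValueError("speedups must be a non-empty sequence of positive numbers")
    return len(speedups) / sum(1.0 / s for s in speedups)


# Figures from the abstract: 541K random operations per second at 4 KB each.
# Assumption: 4 KB = 4096 bytes.
random_bw = iops_to_bandwidth_gb_per_s(541_000, 4 * 1024)
print(f"random 4 KB bandwidth: {random_bw:.2f} GB/s")

# Hypothetical per-workload speedups, NOT values reported in the paper.
example_speedups = [1.0, 2.0, 4.0]
print(f"harmonic mean speedup: {harmonic_mean(example_speedups):.3f}x")
```

Under these assumptions the random-access rate corresponds to roughly 2.2 GB/s, somewhat below the 2.8 GB/s sequential figure. The harmonic mean is dominated by the smallest speedups, which is why it can sit far below the maximum.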
