KML: Using Machine Learning to Improve Storage Systems

Operating systems include many heuristic algorithms designed to improve overall storage performance and throughput. Because such heuristics cannot work well for all conditions and workloads, system designers have resorted to exposing numerous tunable parameters to users, burdening them with continually optimizing their own storage systems and applications. Storage systems are usually responsible for most of the latency in I/O-heavy applications, so even a small latency improvement can be significant. Machine learning (ML) techniques promise to learn patterns, generalize from them, and enable optimal solutions that adapt to changing workloads. We propose that ML solutions become a first-class component of OSs and replace manual heuristics to optimize storage systems dynamically. In this paper, we describe our proposed ML architecture, called KML. We developed a prototype KML architecture and applied it to two case studies: optimizing readahead and NFS read-size values. Our experiments show that KML consumes less than 4KB of dynamic kernel memory, has a CPU overhead below 0.2%, and yet can learn patterns and improve I/O throughput by as much as 2.3× and 15× for the two case studies, even for complex, never-before-seen, concurrently running mixed workloads on different storage devices.
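KML itself runs its learned models inside the kernel; the sketch below is only a hedged userspace illustration of the kind of tunable the readahead case study targets. It writes a readahead value to the standard Linux per-device read_ahead_kb sysfs knob; the predict_readahead_kb() function, the throughput input, and the device path are placeholder assumptions for illustration, not part of KML.

```c
/*
 * Minimal userspace sketch (not the authors' in-kernel KML code): pick a
 * readahead value from a placeholder "model" and write it to the standard
 * Linux sysfs knob. Requires root; the device name is an assumption.
 */
#include <stdio.h>

/* Stand-in for a learned predictor; KML's real model runs in the kernel. */
static int predict_readahead_kb(double recent_throughput_mbps)
{
    /* Toy rule: use a larger readahead window for faster sequential reads. */
    return recent_throughput_mbps > 100.0 ? 512 : 128;
}

/* Write the chosen value (in KB) to the per-device sysfs file. */
static int set_readahead_kb(const char *sysfs_path, int kb)
{
    FILE *f = fopen(sysfs_path, "w");
    if (!f)
        return -1;
    fprintf(f, "%d\n", kb);
    return fclose(f);
}

int main(void)
{
    const char *knob = "/sys/block/sda/queue/read_ahead_kb"; /* example device */
    double throughput_mbps = 150.0;  /* would come from live I/O statistics */
    int kb = predict_readahead_kb(throughput_mbps);

    if (set_readahead_kb(knob, kb) != 0) {
        perror("set_readahead_kb");
        return 1;
    }
    printf("readahead set to %d KB\n", kb);
    return 0;
}
```

The NFS case study targets the analogous read-size (rsize) value, which users otherwise fix by hand as a mount option (e.g., mount -t nfs -o rsize=262144 server:/export /mnt); the paper's point is that such values can instead be learned and adjusted dynamically as workloads change.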
