CHAMELEON: A Self-Evolving, Fully-Adaptive Resource Arbitrator for Storage Systems

Enterprise applications typically depend on guaranteed performance from the storage subsystem, lest they fail. However, unregulated competition is unlikely to result in a fair, predictable apportioning of resources. Given that widespread access protocols and scheduling policies are largely best-effort, the problem of providing performance guarantees on a shared system is a very difficult one. Clients typically lack accurate information on the storage system's capabilities and on the access patterns of the workloads using it, thereby compounding the problem. CHAMELEON is an adaptive arbitrator for shared storage resources; it relies on a combination of self-refining models and constrained optimization to provide performance guarantees to clients. This process depends on minimal information from clients, and is fully adaptive; decisions are based on device and workload models automatically inferred, and continuously refined, at run-time. Corrective actions taken by CHAMELEON are only as radical as warranted by the current degree of knowledge about the system's behavior. In our experiments on a real storage system CHAMELEON identified, analyzed, and corrected performance violations in 3-14 minutes--which compares very favorably with the time a human administrator would have needed. Our learning-based paradigm is a most promising way of deploying large-scale storage systems that service variable workloads on an ever-changing mix of device types.

[1]  Daniel A. Reed,et al.  ARIMA time series modeling and forecasting for adaptive I/O prefetching , 2001, ICS '01.

[2]  Arif Merchant,et al.  Minerva: An automated resource provisioning tool for large-scale storage systems , 2001, TOCS.

[3]  E. Anderson HPL – SSP – 2001 – 4 : Simple table-based modeling of storage devices , 2001 .

[4]  Arif Merchant,et al.  A modular, analytical throughput model for modern disk arrays , 2001, MASCOTS 2001, Proceedings Ninth International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems.

[5]  Ray Jain,et al.  The art of computer systems performance analysis - techniques for experimental design, measurement, simulation, and modeling , 1991, Wiley professional computing.

[6]  Arif Merchant,et al.  Capacity planning with phased workloads , 1998, WOSP '98.

[7]  Xiaoyun Zhu,et al.  Triage: performance isolation and differentiation for storage systems , 2004, Twelfth IEEE International Workshop on Quality of Service, 2004. IWQOS 2004..

[8]  Mark H. Ellisman,et al.  Data-intensive e-science frontier research , 2003, CACM.

[9]  Wei Jin,et al.  Interposed proportional sharing for a storage service utility , 2004, SIGMETRICS '04/Performance '04.

[10]  Thorsten Joachims,et al.  Making large scale SVM learning practical , 1998 .

[11]  Jian Xu,et al.  Performance virtualization for large-scale storage systems , 2003, 22nd International Symposium on Reliable Distributed Systems, 2003. Proceedings..

[12]  Giuseppe Serazzi,et al.  Workload characterization: a survey , 1993, Proc. IEEE.

[13]  Joseph S. Glider,et al.  The software architecture of a SAN storage control system , 2003, IBM Syst. J..

[14]  Kaladhar Voruganti,et al.  Polus: Growing Storage QoS Management Beyond a "4-Year Old Kid" , 2004, FAST.

[15]  Arif Merchant,et al.  Façade: Virtual Storage Devices with Performance Guarantees , 2003, FAST.

[16]  Christos Faloutsos,et al.  Storage device performance prediction with CART models , 2004, The IEEE Computer Society's 12th Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems, 2004. (MASCOTS 2004). Proceedings..

[17]  Eric Anderson,et al.  Proceedings of the Fast 2002 Conference on File and Storage Technologies Hippodrome: Running Circles around Storage Administration , 2022 .

[18]  Dinesh C. Verma,et al.  Simplifying network administration using policy-based management , 2002, IEEE Netw..

[19]  Gang Peng,et al.  Multi-dimensional storage virtualization , 2004, SIGMETRICS '04/Performance '04.

[20]  Juan C. Meza,et al.  OPT++: An object-oriented class library for nonlinear optimization , 1994 .

[21]  Chenyang Lu,et al.  Proceedings of the Fast 2002 Conference on File and Storage Technologies Aqueduct: Online Data Migration with Performance Guarantees , 2022 .

[22]  Margo I. Seltzer,et al.  Using probabilistic reasoning to automate software tuning , 2004, SIGMETRICS '04/Performance '04.