SOL: Safe On-Node Learning in Cloud Platforms
Ricardo Bianchini | Christos Kozyrakis | Yawen Wang | Neeraja J. Yadwadkar | Daniel Crankshaw | Daniel Berger
[1] Christoforos E. Kozyrakis,et al. SmartHarvest: harvesting idle CPUs safely and efficiently in the cloud , 2021, EuroSys.
[2] Brandon Lucia,et al. Adaptive low-overhead scheduling for periodic and reactive intermittent execution , 2020, PLDI.
[3] Zi Yan,et al. Nimble Page Management for Tiered Memory Systems , 2019, ASPLOS.
[4] Ricardo Bianchini,et al. Toward ML-centric cloud platforms , 2020, Commun. ACM.
[5] Daniel Sánchez,et al. Tailbench: a benchmark suite and evaluation methodology for latency-critical applications , 2016, 2016 IEEE International Symposium on Workload Characterization (IISWC).
[6] Christopher Olston,et al. TensorFlow-Serving: Flexible, High-Performance ML Serving , 2017, ArXiv.
[7] Abhishek Verma,et al. Large-scale cluster management at Google with Borg , 2015, EuroSys.
[8] Xin Zhang,et al. TFX: A TensorFlow-Based Production-Scale Machine Learning Platform , 2017, KDD.
[9] Hongzi Mao,et al. Towards Safe Online Reinforcement Learning in Computer Systems , 2019 .
[10] Ricardo Bianchini,et al. Cost-Efficient Overclocking in Immersion-Cooled Datacenters , 2021, 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA).
[11] Christina Delimitrou,et al. Quasar: resource-efficient and QoS-aware cluster management , 2014, ASPLOS.
[12] Michael I. Jordan,et al. The Missing Piece in Complex Analytics: Low Latency, Scalable Model Management and Serving with Velox , 2014, CIDR.
[13] Xin Wang,et al. Clipper: A Low-Latency Online Prediction Serving System , 2016, NSDI.
[14] Hongzi Mao,et al. Placeto: Efficient Progressive Device Placement Optimization , 2018 .
[15] Aleksandrs Slivkins,et al. Introduction to Multi-Armed Bandits , 2019, Found. Trends Mach. Learn.
[16] Benjamin Van Roy,et al. A Tutorial on Thompson Sampling , 2017, Found. Trends Mach. Learn.
[17] W. B. Roberts,et al. Machine Learning: The High Interest Credit Card of Technical Debt , 2014 .
[18] Jeongseob Ahn,et al. Exploring the Design Space of Page Management for Multi-Tiered Memory Systems , 2021, USENIX Annual Technical Conference.
[19] Ricardo Bianchini,et al. Prediction-Based Power Oversubscription in Cloud Platforms , 2020, USENIX Annual Technical Conference.
[20] Brandon Lucia,et al. Automatically enforcing fresh and consistent inputs in intermittent systems , 2021, PLDI.
[21] Heon Y. Yeom,et al. Profiling Dynamic Data Access Patterns with Controlled Overhead and Quality , 2019, Middleware Industry.
[22] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples , 1933 .
[23] Thierry Coppey,et al. SmartChoices: Hybridizing Programming and Machine Learning , 2019 .
[24] Paul M. Carpenter,et al. Hipster: Hybrid Task Manager for Latency-Critical Cloud Workloads , 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[25] Jichuan Chang,et al. Software-Defined Far Memory in Warehouse-Scale Computers , 2019, ASPLOS.
[26] Ricardo Bianchini,et al. Resource Central: Understanding and Predicting Workloads for Improved Resource Management in Large Cloud Platforms , 2017, SOSP.
[27] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[28] Henry Hoffmann,et al. ESP: A Machine Learning Approach to Predicting Application Interference , 2017, 2017 IEEE International Conference on Autonomic Computing (ICAC).
[29] T. Moscibroda,et al. Protean: VM Allocation Service at Scale , 2020, OSDI.
[30] Edward Edberg Halim,et al. LinnOS: Predictability on Unpredictable Flash Storage with a Light Neural Network , 2020, OSDI.
[31] Luiz André Barroso,et al. The tail at scale , 2013, CACM.