George: Learning to Place Long-Lived Containers in Large Clusters with Operation Constraints
Wei Wang | Bo Li | Luping Wang | Suyi Li | Yinghao Yu | Wen Wang
[1] Scott Shenker, et al. Shark: SQL and rich analytics at scale, 2012, SIGMOD '13.
[2] Yanpei Chen, et al. Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads, 2012, Proc. VLDB Endow.
[3] Ameet Talwalkar, et al. MLlib: Machine Learning in Apache Spark, 2015, J. Mach. Learn. Res.
[4] Christoforos E. Kozyrakis, et al. Heracles: Improving resource efficiency at scale, 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[5] Luping Wang, et al. Metis: Learning to Schedule Long-Running Applications in Shared Container Clusters at Scale, 2020, SC20: International Conference for High Performance Computing, Networking, Storage and Analysis.
[6] Kaushik Veeraraghavan, et al. Canopy: An End-to-End Performance Tracing And Analysis System, 2017, SOSP.
[7] Yuan Yu, et al. TensorFlow: A system for large-scale machine learning, 2016, OSDI.
[8] Jing Guo, et al. Who Limits the Resource Efficiency of My Datacenter: An Analysis of Alibaba Datacenter Traces, 2019, 2019 IEEE/ACM 27th International Symposium on Quality of Service (IWQoS).
[9] Abhishek Verma, et al. Large-scale cluster management at Google with Borg, 2015, EuroSys.
[10] Tao Huang, et al. Aladdin: Optimized Maximum Flow Management for Shared Production Clusters, 2019, 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS).
[11] Aman Kansal, et al. Q-clouds: managing performance interference effects for QoS-aware clouds, 2010, EuroSys '10.
[12] Karthik Narasimhan, et al. Projection-Based Constrained Policy Optimization, 2020, ICLR.
[13] Peter R. Pietzuch, et al. Medea: scheduling of long running applications in shared production clusters, 2018, EuroSys.
[14] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[15] Chuan Wu, et al. Deep Learning-based Job Placement in Distributed Machine Learning Clusters, 2019, IEEE INFOCOM 2019 - IEEE Conference on Computer Communications.
[16] Kejiang Ye, et al. Imbalance in the cloud: An analysis on Alibaba cluster trace, 2017, 2017 IEEE International Conference on Big Data (Big Data).
[17] Zhibin Yu, et al. The Elasticity and Plasticity in Semi-Containerized Co-locating Cloud Workload: a View from Alibaba Trace, 2018, SoCC.
[18] Nikolaj Bjørner, et al. Z3: An Efficient SMT Solver, 2008, TACAS.
[19] Peter Stone, et al. Autonomous transfer for reinforcement learning, 2008, AAMAS.
[20] Hongzi Mao, et al. Learning scheduling algorithms for data processing clusters, 2018, SIGCOMM.
[21] Leo Breiman, et al. Random Forests, 2001, Machine Learning.
[22] Mahmut T. Kandemir, et al. Phoenix: A Constraint-Aware Scheduler for Heterogeneous Datacenters, 2017, 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS).
[23] Wei Wang, et al. Continuum: A Platform for Cost-Aware, Low-Latency Continual Learning, 2018, SoCC.
[25] Scott Shenker, et al. Discretized streams: fault-tolerant streaming computation at scale, 2013, SOSP.
[26] Xin Wang, et al. Clipper: A Low-Latency Online Prediction Serving System, 2016, NSDI.
[27] Pieter Abbeel, et al. Constrained Policy Optimization, 2017, ICML.
[28] Ricardo Bianchini, et al. DeepDive: Transparently Identifying and Managing Performance Interference in Virtualized Environments, 2013, USENIX Annual Technical Conference.
[29] Christina Delimitrou, et al. Quasar: resource-efficient and QoS-aware cluster management, 2014, ASPLOS.
[30] Daniel A. Menascé, et al. TPC-W: A Benchmark for E-Commerce, 2002, IEEE Internet Comput.
[31] Sridhar Mahadevan, et al. Recent Advances in Hierarchical Reinforcement Learning, 2003.
[32] Jackson P. Matsuura, et al. Using Transfer Learning to Speed-Up Reinforcement Learning: A Cased-Based Approach, 2010, 2010 Latin American Robotics Symposium and Intelligent Robotics Meeting.
[33] Fabian Hueske, et al. Apache Flink, 2019, Encyclopedia of Big Data Technologies.
[34] Matthias Sax, et al. Apache Kafka, 2019, Encyclopedia of Big Data Technologies.
[35] Gaël Varoquaux, et al. Scikit-learn: Machine Learning in Python, 2011, J. Mach. Learn. Res.
[36] Josu Ceberio, et al. Constrained Combinatorial Optimization with Reinforcement Learning, 2020, ArXiv.
[37] Chita R. Das, et al. Modeling and synthesizing task placement constraints in Google compute clusters, 2011, SoCC.
[38] Xiao Zhang, et al. CPI2: CPU performance isolation for shared compute clusters, 2013, EuroSys '13.
[39] Michael I. Jordan, et al. Ray: A Distributed Framework for Emerging AI Applications, 2017, OSDI.
[40] Carlo Curino, et al. Apache Hadoop YARN: yet another resource negotiator, 2013, SoCC.
[41] Christina Delimitrou, et al. Paragon: QoS-aware scheduling for heterogeneous datacenters, 2013, ASPLOS '13.
[42] Hongzi Mao, et al. Variance Reduction for Reinforcement Learning in Input-Driven Environments, 2018, ICLR.
[43] Ali Anwar, et al. Characterizing Co-located Datacenter Workloads: An Alibaba Case Study, 2018, APSys.
[44] Archana Ganapathi, et al. The Case for Evaluating MapReduce Performance Using Workload Suites, 2011, 2011 IEEE 19th Annual International Symposium on Modelling, Analysis, and Simulation of Computer and Telecommunication Systems.
[45] Shie Mannor, et al. Reward Constrained Policy Optimization, 2018, ICLR.
[46] Sameh Elnikety, et al. Swayam: distributed autoscaling to meet SLAs of machine learning inference services with resource efficiency, 2017, Middleware.
[47] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[48] Carlo Curino, et al. Apache Tez: A Unifying Framework for Modeling and Building Data Processing Applications, 2015, SIGMOD Conference.
[49] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[50] Mor Harchol-Balter, et al. TetriSched: global rescheduling with adaptive plan-ahead in dynamic heterogeneous clusters, 2016, EuroSys.
[51] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[52] Peter Stone, et al. Transfer Learning for Reinforcement Learning Domains: A Survey, 2009, J. Mach. Learn. Res.
[53] Adam Silberstein, et al. Benchmarking cloud serving systems with YCSB, 2010, SoCC '10.
[54] John K. Karlof, et al. Integer programming: theory and practice, 2005.
[55] Lingjia Tang, et al. Bubble-flux: precise online QoS management for increased utilization in warehouse scale computers, 2013, ISCA.