Unearthing inter-job dependencies for better cluster scheduling
Carlo Curino | Gregory R. Ganger | Subru Krishnan | Konstantinos Karanasos | Andrew Chung
[1] David E. Culler,et al. User-Centric Performance Analysis of Market-Based Cluster Batch Schedulers , 2002, 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGRID'02).
[2] David E. Irwin,et al. Balancing risk and reward in a market-based task service , 2004, Proceedings. 13th IEEE International Symposium on High performance Distributed Computing, 2004.
[3] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[4] John Wilkes,et al. Profitable services in an uncertain world , 2005, ACM/IEEE SC 2005 Conference (SC'05).
[5] Marta Mattoso,et al. Provenance Services for Distributed Workflows , 2008, 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID).
[6] Jingren Zhou,et al. SCOPE: easy and efficient parallel processing of massive data sets , 2008, Proc. VLDB Endow..
[7] Liang Zhong,et al. EnaCloud: An Energy-Saving Application Live Placement Approach for Cloud Computing Environments , 2009, 2009 IEEE International Conference on Cloud Computing.
[8] Andrew V. Goldberg,et al. Quincy: fair scheduling for distributed computing clusters , 2009, SOSP '09.
[9] Magdalena Balazinska,et al. Estimating the progress of MapReduce pipelines , 2010, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010).
[10] Magdalena Balazinska,et al. ParaTimer: a progress indicator for MapReduce DAGs , 2010, SIGMOD Conference.
[11] Rajkumar Buyya,et al. Adaptive threshold-based approach for energy-efficient consolidation of virtual machines in cloud data centers , 2010, MGC '10.
[12] Lenin Ravindranath,et al. Nectar: Automatic Management of Data and Computation in Datacenters , 2010, OSDI.
[13] Christopher Ré,et al. Automatic Optimization for MapReduce Programs , 2011, Proc. VLDB Endow..
[14] Benjamin Hindman,et al. Dominant Resource Fairness: Fair Allocation of Multiple Resource Types , 2011, NSDI.
[15] Srikanth Kandula,et al. Jockey: guaranteed job latency in data parallel clusters , 2012, EuroSys '12.
[16] Michael J. Franklin,et al. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing , 2012, NSDI.
[17] Rajkumar Buyya,et al. Optimal online deterministic algorithms and adaptive heuristics for energy and performance efficient dynamic consolidation of virtual machines in Cloud data centers , 2012, Concurr. Comput. Pract. Exp..
[18] Randy H. Katz,et al. Heterogeneity and dynamicity of clouds at scale: Google trace analysis , 2012, SoCC '12.
[19] Michael Abd-El-Malek,et al. Omega: flexible, scalable schedulers for large compute clusters , 2013, EuroSys '13.
[20] Carlo Curino,et al. Apache Hadoop YARN: yet another resource negotiator , 2013, SoCC.
[21] Patrick Wendell,et al. Sparrow: distributed, low latency scheduling , 2013, SOSP.
[22] Xin Chen,et al. Failure Analysis of Jobs in Compute Clouds: A Google Cluster Case Study , 2014, 2014 IEEE 25th International Symposium on Software Reliability Engineering.
[23] Wei Lin,et al. Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing , 2014, OSDI.
[24] Abhishek Verma,et al. Large-scale cluster management at Google with Borg , 2015, EuroSys.
[25] Aditya G. Parameswaran,et al. DataHub: Collaborative Data Science & Dataset Version Management at Scale , 2014, CIDR.
[26] Carlo Curino,et al. Mercury: Hybrid Centralized and Distributed Scheduling in Large Shared Clusters , 2015, USENIX Annual Technical Conference.
[27] Anne-Marie Kermarrec,et al. Hawk: Hybrid Datacenter Scheduling , 2015, USENIX Annual Technical Conference.
[28] Andrea Rosà,et al. Predicting and Mitigating Jobs Failures in Big Data Clusters , 2015, 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.
[29] Angela H. Jiang,et al. JamaisVu: Robust Scheduling with Auto-Estimated Job Runtimes , 2016 .
[30] Carlo Curino,et al. Morpheus: Towards Automated SLOs for Enterprise Clusters , 2016, OSDI.
[31] Srikanth Kandula,et al. Graphene: Packing and Dependency-aware Scheduling for Data-parallel Clusters , 2016, OSDI.
[32] Mor Harchol-Balter,et al. TetriSched: global rescheduling with adaptive plan-ahead in dynamic heterogeneous clusters , 2016, EuroSys.
[33] Aditya Akella,et al. Altruistic Scheduling in Multi-Resource Clusters , 2016, OSDI.
[34] Alon Y. Halevy,et al. Goods: Organizing Google's Datasets , 2016, SIGMOD Conference.
[35] Robert N. M. Watson,et al. Firmament: Fast, Centralized Cluster Scheduling at Scale , 2016, OSDI.
[36] Bianca Schroeder,et al. Learning from Failure Across Multiple Clusters: A Trace-Driven Approach to Understanding, Predicting, and Mitigating Job Terminations , 2017, 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS).
[37] Paul Voigt,et al. The EU General Data Protection Regulation (GDPR): A Practical Guide , 2017 .
[38] Carlo Curino,et al. Dependency-Driven Analytics: A Compass for Uncharted Data Oceans , 2017, CIDR.
[39] Ricardo Bianchini,et al. Resource Central: Understanding and Predicting Workloads for Improved Resource Management in Large Cloud Platforms , 2017, SOSP.
[40] Peter R. Pietzuch,et al. Medea: scheduling of long running applications in shared production clusters , 2018, EuroSys.
[41] Gregory R. Ganger,et al. Stratus: cost-aware container scheduling in the public cloud , 2018, SoCC.
[42] Zhibin Yu,et al. The Elasticity and Plasticity in Semi-Containerized Co-locating Cloud Workload: a View from Alibaba Trace , 2018, SoCC.
[43] Hiren Patel,et al. Computation Reuse in Analytics Job Service at Microsoft , 2018, SIGMOD Conference.
[44] Willy Zwaenepoel,et al. Kairos: Preemptive Data Center Scheduling Without Runtime Estimates , 2018, SoCC.
[45] Gregory R. Ganger,et al. 3Sigma: distribution-based cluster scheduling for runtime uncertainty , 2018, EuroSys.
[46] Hiren Patel,et al. Selecting Subexpressions to Materialize at Datacenter Scale , 2018, Proc. VLDB Endow..
[47] Gregory R. Ganger,et al. On the diversity of cluster workloads and its impact on research results , 2018, USENIX Annual Technical Conference.
[48] Peter R. Pietzuch,et al. Neptune: Scheduling Suspendable Tasks for Unified Stream/Batch Applications , 2019, SoCC.
[49] Carlo Curino,et al. Hydra: a federated resource manager for data-center scale analytics , 2019, NSDI.
[50] Carlo Curino,et al. Peering through the Dark: An Owl's View of Inter-job Dependencies and Jobs' Impact in Shared Clusters , 2019, SIGMOD Conference.
[51] Carlo Curino,et al. Griffon: Reasoning about Job Anomalies with Unlabeled Data in Cloud-based Platforms , 2019, SoCC.
[52] Wei Wang,et al. Characterizing and Synthesizing Task Dependencies of Data-Parallel Jobs in Alibaba Cloud , 2019, SoCC.
[53] Jing Guo,et al. Who Limits the Resource Efficiency of My Datacenter: An Analysis of Alibaba Datacenter Traces , 2019, 2019 IEEE/ACM 27th International Symposium on Quality of Service (IWQoS).
[54] Hongzi Mao,et al. Learning scheduling algorithms for data processing clusters , 2018, SIGCOMM.
[55] Alekh Jindal,et al. Peregrine: Workload Optimization for Cloud Query Engines , 2019, SoCC.
[56] Alekh Jindal,et al. AutoToken: Predicting Peak Parallelism for Big Data Analytics at Microsoft , 2020, Proc. VLDB Endow..