KEA: Tuning an Exabyte-Scale Data Infrastructure
Carlo Curino | Subru Krishnan | Konstantinos Karanasos | Sarvesh Sakalanaga | Isha Tarte | Conor Power | Kartheek Muthyala | Sudhir Darbha | Deli Zhang | Manoj Kumar | Yiwen Zhu | Abhishek Modi | Nick Jurgens | Minu Iyer | Ankita Agarwal
[1] Wei Lin, et al. Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing, 2014, OSDI.
[2] Srikanth Kandula, et al. Recurring job optimization in scope, 2012, SIGMOD Conference.
[3] Michael C. Huang, et al. Hadoop Configuration Tuning With Ensemble Modeling and Metaheuristic Optimization, 2018, IEEE Access.
[4] Lin Ma, et al. External vs. Internal: An Essay on Machine Learning Agents for Autonomous Database Management Systems, 2019, IEEE Data Eng. Bull.
[5] Shivaram Venkataraman, et al. Too Many Knobs to Tune? Towards Faster Database Tuning by Pre-selecting Important Knobs, 2020, HotStorage.
[6] Ricardo Bianchini, et al. Toward ML-centric cloud platforms, 2020, Commun. ACM.
[7] Srikanth Kandula, et al. Jockey: guaranteed job latency in data parallel clusters, 2012, EuroSys '12.
[8] Nicolas Bruno, et al. SCOPE: parallel databases meet MapReduce, 2012, The VLDB Journal.
[9] P. Holland. Statistics and Causal Inference, 1985.
[10] Jingren Zhou, et al. SCOPE: easy and efficient parallel processing of massive data sets, 2008, Proc. VLDB Endow.
[11] Arif Merchant, et al. Janus: Optimal Flash Provisioning for Cloud Storage Workloads, 2013, USENIX Annual Technical Conference.
[12] Wei Zheng, et al. Automatic configuration of internet services, 2007, EuroSys '07.
[13] Jordan Tigani, et al. Google BigQuery Analytics, 2014.
[14] Nando de Freitas, et al. A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning, 2010, ArXiv.
[15] Geoffrey J. Gordon, et al. Automatic Database Management System Tuning Through Large-scale Machine Learning, 2017, SIGMOD Conference.
[16] Eshcar Hillel, et al. Predicting Execution Bottlenecks in Map-Reduce Clusters, 2012, HotCloud.
[17] Student, et al. The Probable Error of a Mean, 1908.
[18] Chris Douglas, et al. Azure Data Lake Store: A Hyperscale Distributed File Service for Big Data Analytics, 2017, SIGMOD Conference.
[19] Jaliya Ekanayake, et al. Hyper Dimension Shuffle: Efficient Data Repartition at Petabyte Scale in Scope, 2019, Proc. VLDB Endow.
[20] Li Zhang, et al. MRONLINE: MapReduce online performance tuning, 2014, HPDC '14.
[21] T. Moscibroda, et al. Protean: VM Allocation Service at Scale, 2020, OSDI.
[22] Stefan Wager, et al. Estimation and Inference of Heterogeneous Treatment Effects using Random Forests, 2015, Journal of the American Statistical Association.
[23] Matt J. Kusner, et al. Bayesian Optimization with Inequality Constraints, 2014, ICML.
[24] Johannes Gehrke, et al. Reinforcement learning for bandwidth estimation and congestion control in real-time communications, 2019, ArXiv.
[25] Abhishek Verma, et al. Large-scale cluster management at Google with Borg, 2015, EuroSys.
[26] Parthasarathy Ranganathan, et al. The Datacenter as a Computer: Designing Warehouse-Scale Machines, Third Edition, 2018, The Datacenter as a Computer.
[27] Scott Shenker, et al. Spark: Cluster Computing with Working Sets, 2010, HotCloud.
[28] Xiaoyu Chen, et al. JetScope: Reliable and Interactive Analytics at Cloud Scale, 2015, Proc. VLDB Endow.
[29] Arif Merchant, et al. Take me to your leader! Online Optimization of Distributed Storage Configurations, 2015, Proc. VLDB Endow.
[30] Liang Dong, et al. Starfish: A Self-tuning System for Big Data Analytics, 2011, CIDR.
[31] Harald C. Gall, et al. Software Engineering for Machine Learning: A Case Study, 2019, 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP).
[32] Frank Dehne, et al. Automatic, On-Line Tuning of YARN Container Memory and CPU Parameters, 2016, 2016 IEEE 18th International Conference on High Performance Computing and Communications; IEEE 14th International Conference on Smart City; IEEE 2nd International Conference on Data Science and Systems (HPCC/SmartCity/DSS).
[33] Jingren Zhou, et al. Incorporating partitioning and parallel plans into the SCOPE optimizer, 2010, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010).
[34] J. Mockus. Bayesian Approach to Global Optimization: Theory and Applications, 1989.
[35] Carlo Curino, et al. Unearthing inter-job dependencies for better cluster scheduling, 2020, OSDI.
[36] Minlan Yu, et al. CherryPick: Adaptively Unearthing the Best Cloud Configurations for Big Data Analytics, 2017, NSDI.
[37] Nicolas Bruno, et al. Continuous Cloud-Scale Query Optimization and Processing, 2013, Proc. VLDB Endow.
[38] Pete Wyckoff, et al. Hive - A Warehousing Solution Over a Map-Reduce Framework, 2009, Proc. VLDB Endow.
[39] Kushal Datta, et al. Gunther: Search-Based Auto-Tuning of MapReduce, 2013, Euro-Par.
[40] Kamal K.c., et al. Performance Tuning of MapReduce Programs, 2015.
[41] Carlo Curino, et al. Hydra: a federated resource manager for data-center scale analytics, 2019, NSDI.
[42] Carlo Curino, et al. Griffon: Reasoning about Job Anomalies with Unlabeled Data in Cloud-based Platforms, 2019, SoCC.
[43] Wei Lin, et al. Microsoft Bing Peking University, 2022.
[44] Herodotos Herodotou, et al. Profiling, what-if analysis, and cost-based optimization of MapReduce programs, 2011, Proc. VLDB Endow.
[45] Jasper Snoek, et al. Practical Bayesian Optimization of Machine Learning Algorithms, 2012, NIPS.
[46] Dorothea Heiss-Czedik, et al. An Introduction to Genetic Algorithms, 1997, Artificial Life.
[47] Carlo Curino, et al. Apache Hadoop YARN: yet another resource negotiator, 2013, SoCC.
[48] Christoforos E. Kozyrakis, et al. Selecta: Heterogeneous Cloud Storage Configuration for Data Analytics, 2018, USENIX Annual Technical Conference.
[49] Jon Howell, et al. Slicer: Auto-Sharding for Datacenter Applications, 2016, OSDI.
[50] A. Owen. A robust hybrid of lasso and ridge regression, 2006.
[51] Alekh Jindal, et al. Peregrine: Workload Optimization for Cloud Query Engines, 2019, SoCC.
[52] Ke Zhou, et al. An End-to-End Automatic Cloud Database Tuning System Using Deep Reinforcement Learning, 2019, SIGMOD Conference.
[53] Wei Lin, et al. Advanced partitioning techniques for massively distributed computation, 2012, SIGMOD Conference.
[54] Carlo Curino, et al. Morpheus: Towards Automated SLOs for Enterprise Clusters, 2016, OSDI.
[55] Carlo Curino, et al. MLOS: An Infrastructure for Automated Software Performance Engineering, 2020, DEEM@SIGMOD.
[56] Carlo Curino, et al. Mercury: Hybrid Centralized and Distributed Scheduling in Large Shared Clusters, 2015, USENIX Annual Technical Conference.
[57] Randy H. Katz, et al. Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center, 2011, NSDI.