Gridiron: A Technique for Augmenting Cloud Workloads with Network Bandwidth Requirements
Ivan Beschastnikh | Alan J. Hu | Margo Seltzer | Shane Bergsma | Nodir Kodirov | Syed M. Iqbal
[1] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[2] J. Christopher Beck,et al. Generating Complex, Realistic Cloud Workloads using Recurrent Neural Networks , 2021, SOSP.
[3] Gene M. Amdahl,et al. Computer Architecture and Amdahl's Law , 2007, Computer.
[4] Ricardo Bianchini,et al. Resource Central: Understanding and Predicting Workloads for Improved Resource Management in Large Cloud Platforms , 2017, SOSP.
[5] Yifei Yuan,et al. On the feasibility of automation for bandwidth allocation problems in data centers , 2013, 2013 Formal Methods in Computer-Aided Design.
[6] Tian Zhou,et al. DeepLight: Deep Lightweight Feature Interactions for Accelerating CTR Predictions in Ad Serving , 2020, WSDM.
[7] Harsh Chawla,et al. Azure Kubernetes Service , 2019.
[8] David Patterson,et al. MLPerf Training Benchmark , 2019, MLSys.
[9] Helen J. Wang,et al. SecondNet: a data center network virtualization architecture with bandwidth guarantees , 2010, CoNEXT.
[10] Kang G. Shin,et al. Tiresias: A GPU Cluster Manager for Distributed Deep Learning , 2019, NSDI.
[11] Sangeetha Abdu Jyothi,et al. TicTac: Accelerating Distributed Deep Learning with Communication Scheduling , 2018, MLSys.
[12] Zhibin Yu,et al. The Elasticity and Plasticity in Semi-Containerized Co-locating Cloud Workload: a View from Alibaba Trace , 2018, SoCC.
[13] Jorge Nocedal,et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima , 2016, ICLR.
[14] Victor I. Chang,et al. The efficient framework and algorithm for provisioning evolving VDC in federated data centers , 2017, Future Gener. Comput. Syst.
[15] Kai Chen,et al. Training Deep Bidirectional LSTM Acoustic Model for LVCSR by a Context-Sensitive-Chunk BPTT Approach , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[16] Lin Li,et al. Towards Robust Green Virtual Cloud Data Center Provisioning , 2017, IEEE Transactions on Cloud Computing.
[17] Gennady Pekhimenko,et al. Priority-based Parameter Propagation for Distributed DNN Training , 2019, SysML.
[18] Wei Wang,et al. Characterizing and Synthesizing Task Dependencies of Data-Parallel Jobs in Alibaba Cloud , 2019, SoCC.
[19] Wei Liu,et al. SSD: Single Shot MultiBox Detector , 2015, ECCV.
[20] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Terry Clark,et al. Parallel Computing , 2017, Encyclopedia of GIS.
[22] Ivan Beschastnikh,et al. Scalable Constraint-based Virtual Data Center Allocation , 2017, IJCAI.
[23] Mor Harchol-Balter,et al. Borg: the next generation , 2020, EuroSys.
[24] Ricardo Bianchini,et al. Serverless in the Wild: Characterizing and Optimizing the Serverless Workload at a Large Cloud Provider , 2020, USENIX ATC.
[25] A. Rowstron,et al. Towards predictable datacenter networks , 2011, SIGCOMM.
[26] Randy H. Katz,et al. Heterogeneity and dynamicity of clouds at scale: Google trace analysis , 2012, SoCC '12.
[27] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[28] Albert G. Greenberg,et al. VL2: a scalable and flexible data center network , 2009, SIGCOMM '09.
[29] Panos Kalnis,et al. Scaling Distributed Machine Learning with In-Network Aggregation , 2019, NSDI.
[30] Chen Feng,et al. Performance Characterization of Hadoop and Data MPI Based on Amdahl's Second Law , 2014, 2014 9th IEEE International Conference on Networking, Architecture, and Storage.
[31] Yibo Zhu,et al. A generic communication scheduler for distributed DNN training acceleration , 2019, SOSP.
[32] Kaiming He,et al. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour , 2017, ArXiv.
[33] Ahmed Amokrane,et al. Greenhead: Virtual Data Center Embedding across Distributed Infrastructures , 2013, IEEE Transactions on Cloud Computing.
[34] Minjae Kim,et al. U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation , 2019, ICLR.
[35] T. Moscibroda,et al. Protean: VM Allocation Service at Scale , 2020, OSDI.
[36] Tat-Seng Chua,et al. Neural Collaborative Filtering , 2017, WWW.
[37] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[38] Wencong Xiao,et al. Analysis of Large-Scale Multi-Tenant GPU Clusters for DNN Training Workloads , 2019, USENIX Annual Technical Conference.