Slack squeeze coded computing for adaptive straggler mitigation
Amir Salman Avestimehr | Murali Annavaram | Mehrdad Kiamari | Krishna Giri Narra | Zhifeng Lin
[1] Ray Hill,et al. A First Course in Coding Theory , 1988 .
[2] Laxmikant V. Kalé,et al. CHARM++: a portable concurrent object oriented system based on C++ , 1993, OOPSLA '93.
[3] Chris Peterson,et al. Implementing a Performance Forecasting System for Metacomputing: The Network Weather Service , 1997, ACM/IEEE SC 1997 Conference (SC'97).
[4] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[5] Peter A. Dinda. Online prediction of the running time of tasks , 2001, SIGMETRICS '01.
[6] Randy H. Katz,et al. Improving MapReduce Performance in Heterogeneous Environments , 2008, OSDI.
[7] Albert G. Greenberg,et al. Reining in the Outliers in Map-Reduce Clusters using Mantri , 2010, OSDI.
[8] Scott Shenker,et al. Spark: Cluster Computing with Working Sets , 2010, HotCloud.
[9] Jiagui Chen. Economy of China Analysis and Forecast (2013) , 2012 .
[10] Scott Shenker,et al. Effective Straggler Mitigation: Attack of the Clones , 2013, NSDI.
[11] Luiz André Barroso,et al. The tail at scale , 2013, CACM.
[12] Nihar B. Shah,et al. When do redundant requests reduce latency? , 2013, 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[13] P. Renteln. Manifolds, Tensors, and Forms: An Introduction for Mathematicians and Physicists , 2013 .
[14] Christina Delimitrou,et al. Quasar: resource-efficient and QoS-aware cluster management , 2014, ASPLOS.
[15] Christoforos E. Kozyrakis,et al. Reconciling high server utilization and sub-millisecond quality-of-service , 2014, EuroSys '14.
[16] Jialin Li,et al. Tales of the Tail: Hardware, OS, and Application-level Sources of Tail Latency , 2014, SoCC.
[17] Gauri Joshi,et al. Efficient task replication for fast response times in parallel computation , 2014, SIGMETRICS.
[18] Abhishek Gupta,et al. Parallel Programming with Migratable Objects: Charm++ in Practice , 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[19] Daniel Sánchez,et al. Ubik: efficient cache sharing with strict QoS for latency-critical workloads , 2014, ASPLOS.
[20] Gregory W. Wornell,et al. Efficient task replication for fast response times in parallel computation , 2014, SIGMETRICS '14.
[21] Christoforos E. Kozyrakis,et al. Towards energy proportionality for large-scale latency-critical workloads , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).
[22] Kannan Ramchandran,et al. On scheduling redundant requests with cancellation overheads , 2015, 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[23] Thu D. Nguyen,et al. ApproxHadoop: Bringing Approximations to MapReduce Frameworks , 2015, ASPLOS.
[24] Erik Saule,et al. Replicated Data Placement for Uncertain Scheduling , 2015, 2015 IEEE International Parallel and Distributed Processing Symposium Workshop.
[25] Mor Harchol-Balter,et al. Reducing Latency via Redundant Requests: Exact Analysis , 2015, SIGMETRICS 2015.
[26] Dimitris S. Papailiopoulos,et al. Speeding up distributed machine learning using codes , 2016, ISIT.
[27] Mohammad Ali Maddah-Ali,et al. A Unified Coding Framework for Distributed Computing with Straggling Servers , 2016, 2016 IEEE Globecom Workshops (GC Wkshps).
[28] Lingjia Tang,et al. Treadmill: Attributing the Source of Tail Latency through Precise Load Testing and Statistical Inference , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[29] Mohammad Ali Maddah-Ali,et al. Coded TeraSort , 2017, 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[30] Mohammad Ali Maddah-Ali,et al. Polynomial Codes: an Optimal Design for High-Dimensional Coded Matrix Multiplication , 2017, NIPS.
[31] Ronald G. Dreslinski,et al. Reining in Long Tails in Warehouse-Scale Computers with Quick Voltage Boosting Using Adrenaline , 2017, ACM Trans. Comput. Syst.
[32] Mor Harchol-Balter,et al. WorkloadCompactor: reducing datacenter cost while providing tail latency SLO guarantees , 2017, SoCC.
[33] Soummya Kar,et al. Coding Method for Parallel Iterative Linear Solver , 2017, ArXiv.
[34] Vipul Gupta,et al. A sequential approximation framework for coded distributed optimization , 2017, 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[35] Amir Salman Avestimehr,et al. Coded computation over heterogeneous clusters , 2017, 2017 IEEE International Symposium on Information Theory (ISIT).
[36] Shivaram Venkataraman,et al. Learning a Code: Machine Learning for Approximate Non-Linear Coded Computation , 2018, ArXiv.
[37] A. Salman Avestimehr,et al. A Fundamental Tradeoff Between Computation and Communication in Distributed Computing , 2016, IEEE Transactions on Information Theory.
[38] Vipul Gupta,et al. OverSketch: Approximate Matrix Multiplication for the Cloud , 2018, 2018 IEEE International Conference on Big Data (Big Data).
[39] Xuehai Qian,et al. Hop: Heterogeneity-aware Decentralized Training , 2019, ASPLOS.
[40] Amir Salman Avestimehr,et al. Lagrange Coded Computing: Optimal Design for Resiliency, Security and Privacy , 2018, AISTATS.
[41] Pulkit Grover,et al. “Short-Dot”: Computing Large Linear Transforms Distributedly Using Coded Short Dot Products , 2017, IEEE Transactions on Information Theory.
[42] Shivaram Venkataraman,et al. Parity Models: A General Framework for Coding-Based Resilience in ML Inference , 2019, ArXiv.
[43] Amir Salman Avestimehr,et al. Collage Inference: Tolerating Stragglers in Distributed Neural Network Inference using Coding , 2019, ArXiv.
[44] Amir Salman Avestimehr,et al. CodedPrivateML: A Fast and Privacy-Preserving Framework for Distributed Machine Learning , 2019, IEEE Journal on Selected Areas in Information Theory.