Occupy the cloud: distributed computing for the 99%

Distributed computing remains inaccessible to a large number of users, in spite of many open source platforms and extensive commercial offerings. While distributed computation frameworks have moved beyond a simple map-reduce model, many users are still left to struggle with complex cluster management and configuration tools, even for running simple embarrassingly parallel jobs. We argue that stateless functions represent a viable platform for these users, eliminating cluster management overhead and fulfilling the promise of elasticity. Furthermore, using our prototype implementation, PyWren, we show that this model is general enough to implement a number of distributed computing models, such as BSP, efficiently. Extrapolating from recent trends in network bandwidth and the advent of disaggregated storage, we suggest that stateless functions are a natural fit for data processing in future computing environments.

Ion Stoica | Benjamin Recht | Shivaram Venkataraman | Qifan Pu | Eric Jonas
