Wide-area analytics with multiple resources

Running data-parallel jobs across geo-distributed sites has emerged as a promising direction due to the growing need for geo-distributed cluster deployment. A key difference between geo-distributed and intra-cluster jobs is the heterogeneous (and often constrained) nature of compute and network resources across the sites. We propose Tetrium, a system for multi-resource allocation in geo-distributed clusters that jointly considers compute and network resources for task placement and job scheduling. Tetrium significantly reduces job response time while incorporating several other performance goals through simple control knobs. Our EC2 deployment and trace-driven simulations suggest that Tetrium improves the average job response time by up to 78% compared to existing data-locality-based solutions, and by up to 55% compared to Iridium, a recently proposed geo-distributed analytics system.
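To illustrate why compute and network resources need to be considered together, the following is a minimal sketch of a toy two-site model. It is not Tetrium's algorithm. The function names (`stage_time`, `best_split`), the wave-based compute model, the single WAN link, and all numbers are assumptions made for illustration only. The sketch estimates how long one stage takes when some tasks whose input lives at a slot-constrained site A are instead run at a larger site B, and searches exhaustively for the split that minimizes the estimate.

```python
import math


def stage_time(moved, tasks_a, tasks_b, slots_a, slots_b,
               gb_per_task, link_gb_per_s, task_secs):
    """Toy estimate (seconds) of stage completion time when `moved` tasks
    whose input is stored at site A run at site B instead.

    Simplifying assumptions (hypothetical, not from the paper):
    - every task takes `task_secs` and runs in waves limited by slot count;
    - site B starts computing only after all moved input has arrived.
    """
    # Site A runs its remaining tasks in waves of `slots_a`.
    time_a = math.ceil((tasks_a - moved) / slots_a) * task_secs
    # Moved input crosses a single WAN link of capacity `link_gb_per_s`.
    transfer = moved * gb_per_task / link_gb_per_s
    # Site B runs its own tasks plus the moved ones after the transfer.
    time_b = transfer + math.ceil((tasks_b + moved) / slots_b) * task_secs
    # The stage finishes when the slower site finishes.
    return max(time_a, time_b)


def best_split(tasks_a, tasks_b, slots_a, slots_b,
               gb_per_task, link_gb_per_s, task_secs):
    """Exhaustively pick the number of tasks to move from A to B."""
    return min(
        range(tasks_a + 1),
        key=lambda m: stage_time(m, tasks_a, tasks_b, slots_a, slots_b,
                                 gb_per_task, link_gb_per_s, task_secs),
    )


if __name__ == "__main__":
    # Hypothetical scenario: all data at a small site A, a large idle site B.
    params = dict(tasks_a=100, tasks_b=0, slots_a=10, slots_b=40,
                  gb_per_task=0.1, link_gb_per_s=1.0, task_secs=10)
    moved = best_split(**params)
    print("pure data locality:", stage_time(0, **params), "s")
    print("best split moves", moved, "tasks:",
          stage_time(moved, **params), "s")
```

In this toy scenario, keeping every task next to its data leaves site B idle, while moving every task overloads the link and site B. The best split balances the compute limit at one site against the network cost of using the other, which is the kind of trade-off the abstract describes at a high level.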
