Managing Data-Intensive Workloads in a Cloud

The amount of data available in many areas is increasing faster than our ability to process it. The promise of “infinite” resources offered by the cloud computing paradigm has led to recent interest in exploiting clouds for large-scale data-intensive computing. Data-intensive computing presents new challenges for systems management in the cloud, including new processing frameworks, such as MapReduce, and the costs inherent in large data sets in distributed environments. Workload management, an important component of systems management, is the discipline of effectively managing, controlling, and monitoring “workflow” across computing systems. This chapter examines the state of the art in workload management for data-intensive computing in clouds. It presents a taxonomy for workload management of data-intensive computing in the cloud and uses that taxonomy to classify and evaluate current workload management mechanisms.
