Grid Computing Workloads

In the mid-1990s, the grid computing community promised the "compute power grid," a utility computing infrastructure for scientists and engineers. Since then, a variety of grids have been built worldwide for academic purposes, for specific application domains, and for general production work. Understanding grid workloads is important for the design and tuning of future grid resource managers and applications, especially in the wake of the recent emergence of commercial grids and clouds. This article presents an overview of the most important characteristics of grid workloads over the past seven years (2003-2010). Although grid user populations range from tens to hundreds of individuals, a few users dominate each grid's workload, both in consumed resources and in the number of jobs submitted to the system. Real grid workloads include very few parallel jobs but many independent single-machine jobs (tasks), which are grouped into "bags of tasks."
