QoS-based Algorithm for Job Allocation and Scheduling in Data Grid

Job allocation and scheduling for data transfer is fundamental to achieving high performance in data grid environments. In this paper, we propose a new algorithm that dynamically combines job allocation with scheduling based on resource quality. The algorithm takes resource failure into account and provides a re-allocation mechanism, so it can use limited resources efficiently and improve the reliability of data transfer in a data grid. We also define resource quality, which comprises information about the CPU and bandwidth of the grid storage node on which the resource resides. To reflect the historical performance of a resource, a new instance of the ant algorithm is designed to calculate and update this resource quality. Based on this quality, the job allocation and scheduling algorithm can take full advantage of high-performance resources while balancing the load among resources. Experimental results show that the algorithm satisfies the expectations.
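The abstract does not give the formulas, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes that quality starts as an equal-weight mix of normalized CPU and bandwidth, that the ant-style update evaporates the old value and then adds a reward or penalty, and that load balancing divides quality by current load. All names and constants (`Resource`, `RHO`, `REWARD`, `PENALTY`, `update_quality`, `pick_resource`, `allocate`) are hypothetical.

```python
from dataclasses import dataclass

RHO = 0.2      # assumed evaporation rate of the pheromone-like quality
REWARD = 0.3   # assumed bonus after a successful transfer
PENALTY = 0.3  # assumed deduction after a failed transfer


@dataclass
class Resource:
    name: str
    cpu: float         # normalized CPU capacity in [0, 1] (assumption)
    bandwidth: float   # normalized bandwidth in [0, 1] (assumption)
    quality: float = 0.0
    load: int = 0
    alive: bool = True

    def __post_init__(self):
        # Assumed initial quality: equal-weight mix of CPU and bandwidth.
        self.quality = 0.5 * self.cpu + 0.5 * self.bandwidth


def update_quality(res, success):
    """Ant-style update: evaporate the old value, then reward or punish."""
    delta = REWARD if success else -PENALTY
    res.quality = max(0.0, (1 - RHO) * res.quality + delta)


def pick_resource(resources):
    """Prefer high quality, discounted by current load to balance work."""
    candidates = [r for r in resources if r.alive]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.quality / (1 + r.load))


def allocate(jobs, resources, fails_on):
    """Assign each job; on a simulated failure, re-allocate to another resource.

    fails_on is a set of (job, resource name) pairs that simulate failures.
    """
    placement = {}
    for job in jobs:
        while job not in placement:
            res = pick_resource(resources)
            if res is None:
                raise RuntimeError("no live resource left for " + job)
            if (job, res.name) in fails_on:
                res.alive = False            # mark the resource as failed
                update_quality(res, False)   # record the bad history
                continue                     # re-allocate the same job
            res.load += 1
            update_quality(res, True)
            placement[job] = res.name
    return placement


resources = [Resource("A", 0.9, 0.9), Resource("B", 0.5, 0.5)]
placement = allocate(["j1", "j2", "j3"], resources, {("j2", "A")})
print(placement)
```

In this run, resource A fails while serving `j2`, so the job is re-allocated to B and A is excluded from later decisions. A fuller model would also let failed resources recover and would release load when a transfer completes.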
