Streamline: scheduling streaming applications in a wide area environment

Scheduling a streaming application on high-performance computing (HPC) resources must account for the computation and communication needs of each stage of the application dataflow graph in order to meet QoS criteria such as latency and throughput. Because the grid evolved out of traditional high-performance computing, the available scheduling tools are better suited to batch-oriented applications. Our scheduler, called Streamline, takes the dynamic nature of the grid into account: it runs periodically and adapts its scheduling decisions using application requirements (per-stage computation and communication needs), application constraints (such as co-location of stages), and resource availability. We compare the performance of Streamline with an Optimal placement, Simulated Annealing (SA) approximations, and E-Condor, a streaming grid scheduler built using Condor. For kernels of streaming applications, we show that Streamline performs close to the Optimal and SA algorithms, and an order of magnitude better than E-Condor under non-uniform load conditions. We also conduct scalability studies that show the advantage of Streamline over the other approaches. Furthermore, we implement Streamline on PlanetLab as a grid service and demonstrate that it performs close to the SA algorithm under dynamic resource conditions.
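
To make the inputs named above concrete, the following is a minimal sketch of a greedy placement that uses per-stage computation and communication needs, a co-location constraint, and current resource availability. It is not the Streamline heuristic itself, whose details are not given here. The function `place_stages`, the tuple layouts, the bottleneck-headroom score, and the example numbers are all assumptions made for illustration.

```python
def place_stages(stages, resources, colocate):
    """Greedy placement sketch (illustrative only, not the Streamline algorithm).

    stages:    {stage_name: (compute_need, comm_need)}      # hypothetical units
    resources: {resource_name: (cpu_avail, bandwidth_avail)}  # hypothetical units
    colocate:  list of (stage_a, stage_b) pairs that must share a resource
    """
    # Make the co-location constraint symmetric for easy lookup.
    partner = {}
    for a, b in colocate:
        partner[a] = b
        partner[b] = a

    # Mutable copy of what each resource still has available.
    remaining = {name: list(cap) for name, cap in resources.items()}
    placement = {}

    # Place the most demanding stages first.
    order = sorted(stages, key=lambda s: sum(stages[s]), reverse=True)
    for stage in order:
        compute, comm = stages[stage]
        mate = partner.get(stage)
        if mate in placement:
            # Honor co-location: follow the partner that is already placed.
            target = placement[mate]
        else:
            # Assumed score: the resource whose tighter dimension
            # (CPU or bandwidth) keeps the most headroom after placement.
            target = max(
                remaining,
                key=lambda r: min(remaining[r][0] - compute,
                                  remaining[r][1] - comm),
            )
        remaining[target][0] -= compute
        remaining[target][1] -= comm
        placement[stage] = target
    return placement


# Hypothetical three-stage pipeline on two resources.
stages = {"capture": (2, 3), "detect": (6, 2), "display": (1, 1)}
resources = {"A": (10, 10), "B": (8, 8)}
print(place_stages(stages, resources, colocate=[("capture", "display")]))
```

A periodic scheduler in this style would call `place_stages` again with refreshed availability figures each time it runs, so that placements can follow changes in load.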
