Resource Allocation for Distributed Streaming Applications

We consider resource allocation for distributed streaming applications running in a grid environment, where continuously streaming data must be aggregated and processed to produce output streams. Because such an application comprises a pipeline of processing stages, resource allocation must take both communication and computational requirements into account. In this paper, we give a rigorous formulation of this resource allocation problem, based on DAG representations of both the application and the environment. We show how the notion of subgraph isomorphism can be applied to this problem and develop an effective resource allocation algorithm. The main observations from the experiments we conducted to evaluate our algorithm are as follows. The overhead of our algorithm is comparable to that of an existing heuristic-based algorithm, Streamline, while application performance improved by 30% on average. Compared with the allocation produced by the optimal algorithm, which enumerates all mappings, application performance with our algorithm was within 4% of the optimal. Unlike the optimal algorithm, however, our algorithm scaled well to large graphs.
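To make the formulation concrete, the following is a minimal sketch of an exhaustive enumerator in the spirit of the optimal baseline mentioned above. It is not the algorithm proposed in this paper. It assumes that each stage is placed on a distinct node, that every application edge must map onto a direct link, and that requirements are plain numbers; the function name `feasible_mappings` and the sample graphs are hypothetical.

```python
from itertools import permutations


def feasible_mappings(stages, edges, nodes, links):
    """Yield every injective stage -> node mapping that meets the constraints.

    stages: {stage: required compute}
    edges:  {(src_stage, dst_stage): required bandwidth}
    nodes:  {node: available compute}
    links:  {(src_node, dst_node): available bandwidth} (directed)
    """
    stage_names = list(stages)
    # Brute force: try every ordered choice of distinct nodes for the stages.
    for chosen in permutations(nodes, len(stage_names)):
        mapping = dict(zip(stage_names, chosen))
        # Each stage needs a node with enough compute capacity.
        if any(stages[s] > nodes[mapping[s]] for s in stage_names):
            continue
        # Each application edge needs a direct link with enough bandwidth
        # (assumption: a missing link counts as zero bandwidth).
        if all(links.get((mapping[u], mapping[v]), 0) >= bw
               for (u, v), bw in edges.items()):
            yield mapping


# Hypothetical three-stage pipeline and four-node environment.
stages = {"source": 1, "filter": 4, "sink": 2}
edges = {("source", "filter"): 10, ("filter", "sink"): 5}
nodes = {"A": 2, "B": 8, "C": 4, "D": 1}
links = {("A", "B"): 20, ("B", "C"): 10, ("D", "B"): 5, ("A", "C"): 50}

result = list(feasible_mappings(stages, edges, nodes, links))
print(result)  # [{'source': 'A', 'filter': 'B', 'sink': 'C'}]
```

The number of candidate mappings grows factorially with the number of stages, which illustrates why exhaustive enumeration does not scale to large graphs.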
