Decentralized Load Balancing for Highly Irregular Search Problems

In this paper, we present a Dynamic Load Balancing (DLB) policy for problems characterized by a highly irregular search tree, for which no reliable workload prediction is available. DLB approaches based on global statistics are known to provide optimal load balancing performance, while randomized techniques provide high scalability. The proposed method combines both advantages by adopting distributed job pools and a randomized polling policy. The method has been successfully applied in a parallel search algorithm for subgraph mining. The workload distribution process of the parallel application is based on a dynamic partitioning of the search space and a peer-to-peer communication framework. The effectiveness of the DLB method has been evaluated on a molecular biology dataset. The distributed application with the novel DLB method has shown good scalability and close-to-linear speedup in a distributed network of workstations. Moreover, fault tolerance and dynamic resource aggregation make it suitable for large-scale, multi-domain, heterogeneous environments, such as computational Grids.
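The combination of distributed job pools and randomized polling can be illustrated with a small simulation. The sketch below is not the authors' implementation: the round-based scheduling, the "donor gives away half of its pool" rule, the integer job sizes, and the names `simulate`, `pools`, and `processed` are all assumptions made for illustration. Each job stands for a subtree of unknown shape; an idle worker polls one peer chosen uniformly at random and takes part of that peer's pool.

```python
import random
from collections import deque

def simulate(num_workers=4, root_budget=200, seed=0):
    """Toy round-based simulation of randomized polling over per-worker job pools.

    A job is an integer: the number of nodes in the subtree it represents.
    Returns how many nodes each worker expanded. All policy details here are
    illustrative assumptions, not the method described in the paper.
    """
    rng = random.Random(seed)
    pools = [deque() for _ in range(num_workers)]
    pools[0].append(root_budget)      # the whole search tree starts on worker 0
    processed = [0] * num_workers

    while any(pools):                 # stop when every pool is empty
        for w in range(num_workers):
            if not pools[w]:
                # Idle worker: poll one peer chosen uniformly at random.
                donor = rng.choice([p for p in range(num_workers) if p != w])
                # Assumed donation rule: hand over half of the donor's pool
                # (rounded down), oldest jobs first.
                for _ in range(len(pools[donor]) // 2):
                    pools[w].append(pools[donor].popleft())
                continue
            # Busy worker: expand one node depth-first, then split the rest of
            # the subtree into children of unpredictable sizes.
            size = pools[w].pop()
            processed[w] += 1
            remaining = size - 1
            while remaining > 0:
                child = rng.randint(1, remaining)
                pools[w].append(child)
                remaining -= child
    return processed

if __name__ == "__main__":
    counts = simulate()
    print("nodes expanded per worker:", counts)
    print("total:", sum(counts))
```

Because a donor never gives away its last job in this sketch, at least one worker makes progress in every round, so the simulation terminates with every node expanded exactly once. Varying `seed` and `num_workers` gives a rough feel for how evenly random polling spreads an irregular tree without any global statistics.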
