Reducing Redundant Search in Parallel Graph Mining Using Exceptions

This paper proposes an implementation of a parallel graph mining algorithm in a task-parallel language that has exception handling features. A straightforward task-parallel implementation performs poorly for many practical backtrack search algorithms because of the pruning they employ: a sequential search prunes a useless subtree before traversing it to reduce the search space, whereas in a parallel search a worker may prune that subtree only after another worker has already started traversing it, which results in a large amount of redundant search. This redundancy can be significantly reduced by notifying a worker that the subtree it is traversing has been pruned, so that it aborts the traversal. Such an abort can be implemented elegantly and efficiently in a task-parallel language whose exception handling mechanism automatically aborts all parallel tasks running in a try block when an exception is thrown in that block. We applied this abort mechanism to COPINE, a graph mining algorithm used in practice for drug discovery, using the task-parallel language Tascell. As a result, we reduced the search space by 31.9% and the execution time by 27.4% in a 28-worker execution.
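The abort idea can be illustrated with a minimal Python sketch. It is not Tascell code and not the COPINE implementation: the names `Pruned`, `Aborted`, `traverse`, `run_try_block`, and `demo` are hypothetical, threads stand in for workers, and a polled `threading.Event` stands in for the language-level mechanism that aborts sibling tasks. One task traverses a binary subtree while a sibling task decides that the subtree is useless and raises an exception; the enclosing "try block" then cancels the traversal.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Pruned(Exception):
    """Hypothetical signal: the subtree covered by this try block is useless."""


class Aborted(Exception):
    """Hypothetical signal: a task noticed that its try block was cancelled."""


def traverse(depth, cancel, stats, started):
    # Poll the shared flag at every node; the short wait stands in for real work.
    if cancel.wait(timeout=0.001):
        raise Aborted()
    stats["visited"] += 1
    started.set()  # tell the sibling task that the traversal is under way
    if depth > 0:
        traverse(depth - 1, cancel, stats, started)
        traverse(depth - 1, cancel, stats, started)


def run_try_block(tasks):
    """Run tasks in parallel; if one raises, ask the others to abort."""
    cancel = threading.Event()

    def guarded(task):
        try:
            task(cancel)
        except Aborted:
            return None       # this task was aborted collaterally
        except Exception as exc:
            cancel.set()      # abort every other task in the block
            return exc
        return None

    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        outcomes = list(pool.map(guarded, tasks))
    # Return the first exception that was actually thrown, if any.
    return next((e for e in outcomes if e is not None), None)


def demo(depth=9):
    stats = {"visited": 0}
    started = threading.Event()

    def searcher(cancel):
        traverse(depth, cancel, stats, started)

    def pruner(cancel):
        started.wait()  # prune only after the other worker has begun
        raise Pruned("subtree cannot contain a solution")

    caught = run_try_block([searcher, pruner])
    full_size = 2 ** (depth + 1) - 1  # nodes in the complete subtree
    return caught, stats["visited"], full_size
```

Calling `demo()` returns the caught exception, the number of nodes visited before the abort took effect, and the size of the full subtree. The gap between the last two values is the redundant search that the abort avoids in this toy setting.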
