Balanced Multi-process Parallel Algorithm for Chemical Compound Inference with Given Path Frequencies

Enumerating chemical compounds with given path frequencies is a fundamental procedure in chemoinformatics and bioinformatics, with applications that include structure determination and the development of novel molecules. The problem has been proven to be NP-hard. Many methods have been proposed to solve it, but most of them are heuristic. Fujiwara et al. proposed a sequential branch-and-bound algorithm. Although it finds all solutions while avoiding exhaustive search, its computation time still increases significantly as the number of atoms grows. This paper therefore presents a parallel algorithm for the problem. The experimental results show that computation time decreased as more processes were launched. Moreover, the speed-up ratio was satisfactory for most of the test cases, and the algorithm shows potential for use in drug design.
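To make the notion of path frequency concrete, the following minimal Python sketch counts the label sequences of all simple paths in a small labeled tree. It illustrates only the forward direction (graph to frequencies); the inference problem discussed here is the inverse. The function name `path_frequencies`, the parameter `max_edges`, and the convention that a path and its reverse are counted separately are assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter


def path_frequencies(labels, edges, max_edges):
    """Count label sequences over all simple paths with at most max_edges edges.

    labels: dict mapping vertex id -> atom label (hypothetical input format)
    edges:  list of undirected (u, v) pairs
    Assumption: each path is counted once per direction, so a sequence and
    its reverse are tallied separately.
    """
    adjacency = {v: [] for v in labels}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    freq = Counter()

    def extend(path):
        # Record the label sequence of the current path.
        freq[tuple(labels[v] for v in path)] += 1
        if len(path) - 1 == max_edges:
            return
        # Grow the path by one unvisited neighbour of its last vertex.
        for w in adjacency[path[-1]]:
            if w not in path:
                extend(path + [w])

    for start in labels:
        extend([start])
    return freq


# Heavy-atom skeleton C-C-O (hydrogens omitted for brevity).
labels = {0: "C", 1: "C", 2: "O"}
edges = [(0, 1), (1, 2)]
freq = path_frequencies(labels, edges, max_edges=2)
for sequence, count in sorted(freq.items()):
    print("-".join(sequence), count)
```

An enumeration algorithm for the inverse problem would search for every graph whose frequency table equals a given target; a branch-and-bound search can discard a partial structure as soon as one of its counts exceeds the target.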
