Enumerating tree-like chemical graphs with given upper and lower bounds on path frequencies

Background: Enumerating chemical graphs that satisfy given constraints is a fundamental problem in chemoinformatics and bioinformatics, with applications that include structure determination of novel chemical compounds and drug design.

Results: In this paper, we consider the problem of enumerating all tree-like chemical graphs from a given set of feature vectors. The set is specified by a pair of upper and lower feature vectors, where a feature vector represents the frequency of prescribed paths in the chemical compound to be constructed. The problem can be solved by applying the algorithm proposed by Ishida et al. to each feature vector in the set separately, but this can be computationally expensive because the set generally contains many feature vectors. We propose a new exact branch-and-bound algorithm that handles all the feature vectors in the given set directly. Because the bounding operation proposed by Ishida et al. cannot be used under upper and lower constraints, we introduce new bounding operations based on the upper and lower feature vectors, a bond constraint, and a detachment condition.

Conclusions: Our proposed algorithm is useful for enumerating tree-like chemical graphs with given upper and lower bounds on path frequencies.
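To make the notion of a path-frequency feature vector concrete, the following minimal sketch counts label sequences of short paths in a small tree-like graph and checks whether the resulting vector lies between a lower and an upper vector. It is an illustration only, not the branch-and-bound algorithm of the paper. The function names `path_frequencies` and `within_bounds`, the choice to count each undirected path once, and the omission of hydrogens and bond multiplicities are all assumptions made for this example.

```python
from collections import Counter


def path_frequencies(labels, edges, max_len):
    """Count label sequences of simple paths with at most max_len edges.

    labels: dict mapping integer vertex ids to atom labels (assumed input format).
    edges:  list of (u, v) pairs describing a tree.
    Each undirected path is counted once, keyed by the smaller of its label
    sequence and the reversed sequence (an assumption of this sketch).
    """
    adj = {v: [] for v in labels}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    freq = Counter()

    def extend(path):
        seq = tuple(labels[v] for v in path)
        # A path is reached from both of its ends; record it only once.
        if path[0] <= path[-1]:
            freq[min(seq, seq[::-1])] += 1
        if len(path) - 1 == max_len:
            return
        for w in adj[path[-1]]:
            if w not in path:
                extend(path + [w])

    for v in labels:
        extend([v])
    return freq


def within_bounds(freq, lower, upper):
    """Return True if lower <= freq <= upper holds for every path key.

    Keys missing from a Counter are treated as frequency 0.
    """
    keys = set(freq) | set(lower) | set(upper)
    return all(lower[k] <= freq[k] <= upper[k] for k in keys)


# Heavy-atom skeleton C-C-O (hydrogens omitted for brevity).
labels = {0: "C", 1: "C", 2: "O"}
edges = [(0, 1), (1, 2)]
freq = path_frequencies(labels, edges, max_len=1)

lower = Counter({("C",): 1})
upper = Counter({("C",): 3, ("O",): 1, ("C", "C"): 2, ("C", "O"): 1})
print(within_bounds(freq, lower, upper))
```

An enumeration procedure in this setting would output only those trees whose feature vector passes a check like `within_bounds`; the bounding operations described in the abstract aim to prune partial trees that can no longer do so.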
