Clique-detection algorithms for matching three-dimensional molecular structures.

The representation of chemical and biological molecules as graphs permits the use of a maximum common subgraph (MCS) isomorphism algorithm to identify the structural relationships between pairs of such molecular graphs. Clique detection provides an efficient way of implementing MCS detection, and this article reports a comparison of several clique-detection algorithms when used for this purpose. Experiments with both small molecules and proteins demonstrate that the most efficient algorithm for these particular applications, which typically involve correspondence graphs with low edge densities, is the one described by Carraghan and Pardalos. It is shown to be two to three times faster than the Bron-Kerbosch algorithm that has been used previously for MCS applications in chemistry and biology. However, the Bron-Kerbosch algorithm identifies all substructures common to a pair of molecules, and not just the largest ones, as the other algorithms considered here do. The two algorithms can usefully be combined to increase the efficiency of database-searching systems that use the MCS as a measure of structural similarity.
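The reduction from MCS detection to clique detection can be illustrated with a small sketch. It is not the implementation evaluated in the article. It assumes molecules are given as lists of `(label, (x, y, z))` atoms, that two atom pairings are compatible when the corresponding inter-atomic distances agree within a tolerance `tol`, and it uses hypothetical function names throughout. `maximal_cliques` is a plain Bron-Kerbosch enumeration without pivoting, and `maximum_clique` is a simple branch and bound that prunes on the size bound, in the spirit of (but much simpler than) the Carraghan and Pardalos algorithm.

```python
from itertools import combinations
from math import dist


def correspondence_graph(mol_a, mol_b, tol=0.5):
    """Build the correspondence graph of two molecules.

    Nodes are pairs (i, j) of atoms with the same label; two nodes are
    adjacent when the distance between the atoms in mol_a matches the
    distance between their partners in mol_b to within tol (an assumed
    compatibility rule).
    """
    nodes = [(i, j)
             for i, (label_a, _) in enumerate(mol_a)
             for j, (label_b, _) in enumerate(mol_b)
             if label_a == label_b]
    adj = {node: set() for node in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue  # an atom may be matched at most once
        d_a = dist(mol_a[i][1], mol_a[k][1])
        d_b = dist(mol_b[j][1], mol_b[l][1])
        if abs(d_a - d_b) <= tol:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj


def maximal_cliques(adj):
    """Bron-Kerbosch without pivoting: yield every maximal clique."""
    def expand(r, p, x):
        if not p and not x:
            yield set(r)
            return
        for v in list(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(adj), set())


def maximum_clique(adj):
    """Branch and bound: return one clique of maximum size."""
    best = []

    def expand(current, candidates):
        nonlocal best
        if len(current) > len(best):
            best = list(current)
        while candidates:
            # Prune: even taking every candidate cannot beat the best.
            if len(current) + len(candidates) <= len(best):
                return
            v = candidates.pop()
            expand(current + [v], [u for u in candidates if u in adj[v]])

    expand([], sorted(adj))
    return best
```

Each clique in the correspondence graph is a set of atom pairings whose distances are mutually consistent, so `maximum_clique` returns one largest common substructure while `maximal_cliques` enumerates every common substructure that cannot be extended. A database search could, under these assumptions, use the cheaper maximum-clique routine to rank candidates and reserve full enumeration for the top hits.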
