Paper information - Clique-detection models in computational biochemistry and genomics

Abstract: Many important problems arising in computational biochemistry and genomics have been formulated in terms of underlying combinatorial optimization models, and a number of them have been formulated as clique-detection models. The article introduces the underlying biochemistry and genomics aspects of these problems as well as the graph-theoretic aspects of the solution approaches. Each subsequent section describes a particular type of problem, gives an example to show how the graph model can be derived, summarizes recent progress, and discusses the challenges of solving the associated graph-theoretic models. Clique-detection models involve finding (a) a maximal clique, (b) a maximum clique, (c) a maximum weighted clique, or (d) all maximal cliques in a graph. The biochemistry and genomics problems that can be represented by a clique-detection model include integration of genome mapping data, nonoverlapping local alignments, matching and comparing molecular structures, and protein docking.
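The four clique-detection variants named in the abstract can be told apart on a small graph. The following is a minimal sketch, not the article's own method: it uses a basic Bron-Kerbosch enumeration without pivoting, and it assumes an undirected graph stored as an adjacency dictionary and strictly positive vertex weights. The function names, the toy graph, and the weights are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set

# Assumption: undirected graph as {vertex: set of neighbours}, no self-loops.
Graph = Dict[str, Set[str]]


def all_maximal_cliques(graph: Graph) -> List[Set[str]]:
    """Variant (d): list every maximal clique (basic Bron-Kerbosch, no pivoting)."""
    cliques: List[Set[str]] = []

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> None:
        # r: current clique, p: candidates that can extend r,
        # x: vertices already processed (used to skip non-maximal cliques).
        if not p and not x:
            cliques.append(set(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p.remove(v)
            x.add(v)

    expand(set(), set(graph), set())
    return cliques


def a_maximal_clique(graph: Graph) -> Set[str]:
    """Variant (a): grow one clique greedily until no vertex can be added."""
    clique: Set[str] = set()
    for v in sorted(graph):
        if clique <= graph[v]:  # v is adjacent to every vertex already chosen
            clique.add(v)
    return clique


def maximum_clique(graph: Graph) -> Set[str]:
    """Variant (b): a maximal clique with the largest number of vertices."""
    return max(all_maximal_cliques(graph), key=len)


def maximum_weighted_clique(graph: Graph, weight: Dict[str, float]) -> Set[str]:
    """Variant (c): a clique of largest total weight (assumes positive weights,
    so an optimum is always found among the maximal cliques)."""
    return max(all_maximal_cliques(graph), key=lambda c: sum(weight[v] for v in c))


# Hypothetical toy graph: triangle a-b-c plus a pendant edge c-d.
toy_graph: Graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
toy_weight = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 5.0}

if __name__ == "__main__":
    print(all_maximal_cliques(toy_graph))
    print(a_maximal_clique(toy_graph))
    print(maximum_clique(toy_graph))
    print(maximum_weighted_clique(toy_graph, toy_weight))
```

On the toy graph the maximum clique and the maximum weighted clique differ, because the heavy vertex d belongs only to the smaller clique. Exhaustive enumeration is exponential in the worst case, so a sketch like this is suitable only for small instances.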

Wilbert E. Wilhelm | Sergiy Butenko
