Using Dominances for Solving the Protein Family Identification Problem

Identifying protein families is a computational biology challenge that calls for efficient and reliable methods. Here we introduce the concept of dominance and propose a novel combined approach based on the Distance Alignment Search Tool (DAST), which includes an exact algorithm with bounds. Our experiments show that this method successfully finds the most similar proteins in a set without solving all instances.
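To make the idea of skipping instances concrete, here is a minimal Python sketch of how bounds could be used to discard comparisons before they are solved exactly. It is an illustration under stated assumptions, not the method of the paper: the abstract does not define dominance, so the rule used here (one candidate dominates another when its lower bound on similarity exceeds the other's upper bound) is an assumed reading, and the names `Candidate`, `dominates`, and `undominated` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A protein compared against a query (hypothetical structure)."""
    name: str
    lower: float  # lower bound on similarity to the query
    upper: float  # upper bound on similarity to the query


def dominates(a: Candidate, b: Candidate) -> bool:
    # Assumed rule: a dominates b when even a's worst case
    # is strictly better than b's best case.
    return a.lower > b.upper


def undominated(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates.

    Only these could still be the most similar protein, so only
    their instances would need to be solved further.
    """
    if not candidates:
        return []
    best_lower = max(c.lower for c in candidates)
    # c is dominated exactly when some lower bound exceeds c.upper,
    # i.e. when best_lower > c.upper.
    return [c for c in candidates if c.upper >= best_lower]


candidates = [
    Candidate("A", 0.80, 0.90),
    Candidate("B", 0.30, 0.70),
    Candidate("C", 0.75, 0.85),
    Candidate("D", 0.10, 0.80),
]
remaining = undominated(candidates)
```

In this toy input, candidate B can be discarded without ever tightening its bounds, because A's lower bound already exceeds B's upper bound. The other three remain and would need further refinement.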
