MDAT - Aligning multiple domain arrangements

Background: Proteins are composed of domains, protein segments that fold independently from the rest of the protein and have a specific function. During evolution, the arrangement of domains can change: domains are gained or lost, or their order is rearranged. To facilitate the analysis of these changes, we propose the use of multiple domain alignments.

Results: We developed an alignment program, called MDAT, which aligns multiple domain arrangements. MDAT extends earlier programs that perform pairwise alignments of domain arrangements. MDAT uses a domain similarity matrix to score domain pairs and aligns the domain arrangements using a consistency-supported progressive alignment method.

Conclusion: MDAT will be useful for analysing changes in domain arrangements within and between protein families and will thus provide valuable insights into the evolution of proteins and their domains. MDAT is coded in C++, and the source code is freely available for download at http://www.bornberglab.org/pages/mdat.
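To make the idea of scoring domain pairs with a similarity matrix concrete, the following is a minimal sketch of a pairwise global alignment of two domain arrangements, written in Python for brevity. It is not MDAT's implementation: the function names (`domain_score`, `align_arrangements`), the score values (10 for identical domains, -2 for unrelated domains, -3 per gap), and the example domain names are all assumptions chosen for illustration. The consistency step and the progressive, guide-tree-driven combination of pairwise alignments into a multiple alignment are deliberately left out.

```python
from typing import Dict, List, Optional, Tuple

Arrangement = List[str]                      # a protein as an ordered list of domain IDs
SimMatrix = Dict[Tuple[str, str], int]       # hypothetical domain similarity scores


def domain_score(a: str, b: str, sim: SimMatrix,
                 match: int = 10, mismatch: int = -2) -> int:
    """Score a pair of domains; all default values are illustrative assumptions."""
    if a == b:
        return sim.get((a, a), match)
    # Look the pair up in either order, falling back to the mismatch score.
    return sim.get((a, b), sim.get((b, a), mismatch))


def align_arrangements(x: Arrangement, y: Arrangement, sim: SimMatrix,
                       gap: int = -3):
    """Needleman-Wunsch global alignment where the 'letters' are whole domains."""
    n, m = len(x), len(y)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + domain_score(x[i - 1], y[j - 1], sim),
                score[i - 1][j] + gap,   # domain of x aligned to a gap
                score[i][j - 1] + gap,   # domain of y aligned to a gap
            )

    # Trace back from the bottom-right cell; None marks a gap.
    ax: List[Optional[str]] = []
    ay: List[Optional[str]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and score[i][j] ==
                score[i - 1][j - 1] + domain_score(x[i - 1], y[j - 1], sim)):
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            ax.append(x[i - 1]); ay.append(None); i -= 1
        else:
            ax.append(None); ay.append(y[j - 1]); j -= 1
    return score[n][m], ax[::-1], ay[::-1]


if __name__ == "__main__":
    # Hypothetical arrangements: the second protein lacks the leading domain.
    total, a, b = align_arrangements(["SH3", "SH2", "Pkinase"],
                                     ["SH2", "Pkinase"], sim={})
    print(total, a, b)
```

In this sketch a domain loss shows up as a gap in the aligned arrangement, and a similarity matrix entry lets two related but non-identical domains align with a positive score instead of being treated as a mismatch.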
