Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis.

The use of some multiple-sequence alignments in phylogenetic analysis, particularly those that are not very well conserved, requires the elimination of poorly aligned positions and divergent regions, since these may not be homologous or may have been saturated by multiple substitutions. A computerized method that eliminates such positions while trying to minimize the loss of informative sites is presented here. The method selects blocks of positions that fulfill a simple set of requirements with respect to the number of contiguous conserved positions, lack of gaps, and high conservation of flanking positions, making the final alignment more suitable for phylogenetic analysis. To illustrate the efficiency of this method, alignments of 10 mitochondrial proteins from several completely sequenced mitochondrial genomes belonging to diverse eukaryotes were used as examples. The percentage of removed positions was higher in the most divergent alignments. After removing divergent segments, the amino acid composition of the different sequences was more uniform, and pairwise distances became much smaller. Phylogenetic trees show that topologies can differ once divergent regions are removed and only the conserved blocks are retained, particularly when there are several poorly resolved nodes. Strong support was found for the grouping of animals and fungi but not for the position of more basal eukaryotes. The use of a computerized method such as the one presented here reduces, to a certain extent, the need to edit multiple alignments manually, makes the automated phylogenetic analysis of large data sets feasible, and makes it easier for other researchers to reproduce the final alignment.
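The block-selection idea described in the abstract can be illustrated with a minimal Python sketch. This is not the published program: the function names (`classify_columns`, `select_blocks`, `trim_alignment`), the parameter names, the default thresholds, and the simplified handling of gap columns are all assumptions made for illustration. The sketch labels each alignment column by how many sequences share its most common residue, rejects gap columns and long runs of nonconserved columns, trims each surviving block until it is flanked by highly conserved columns, and discards blocks that end up too short.

```python
from collections import Counter


def classify_columns(seqs, min_conserved, min_flank):
    """Label each column: 'G' gap, 'N' nonconserved, 'C' conserved, 'H' highly conserved."""
    labels = []
    for col in zip(*seqs):
        if "-" in col:
            labels.append("G")
            continue
        top = Counter(col).most_common(1)[0][1]  # count of the most frequent residue
        if top >= min_flank:
            labels.append("H")
        elif top >= min_conserved:
            labels.append("C")
        else:
            labels.append("N")
    return labels


def select_blocks(seqs, min_conserved, min_flank, max_nonconserved=2, min_block=3):
    """Return half-open (start, end) column ranges of the retained blocks.

    Hypothetical, simplified rules (assumptions, not the published algorithm):
    gap columns are always rejected, runs of more than `max_nonconserved`
    nonconserved columns are rejected, blocks are trimmed to start and end on
    highly conserved columns, and blocks shorter than `min_block` are dropped.
    """
    labels = classify_columns(seqs, min_conserved, min_flank)
    n = len(labels)
    keep = [lab != "G" for lab in labels]

    # Reject contiguous nonconserved runs that are too long.
    i = 0
    while i < n:
        if labels[i] != "N":
            i += 1
            continue
        j = i
        while j < n and labels[j] == "N":
            j += 1
        if j - i > max_nonconserved:
            keep[i:j] = [False] * (j - i)
        i = j

    # Collect surviving runs, trim their flanks, and filter by length.
    blocks = []
    i = 0
    while i < n:
        if not keep[i]:
            i += 1
            continue
        j = i
        while j < n and keep[j]:
            j += 1
        start, end = i, j
        while start < end and labels[start] != "H":
            start += 1
        while end > start and labels[end - 1] != "H":
            end -= 1
        if end - start >= min_block:
            blocks.append((start, end))
        i = j
    return blocks


def trim_alignment(seqs, blocks):
    """Concatenate the selected blocks for every sequence."""
    return ["".join(s[a:b] for a, b in blocks) for s in seqs]
```

Because the selection depends only on the input alignment and a handful of explicit thresholds, the same trimmed alignment can be regenerated by anyone who uses the same settings, which is the reproducibility benefit the abstract points to.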
