The medaka draft genome and insights into vertebrate genome evolution

Teleosts comprise more than half of all vertebrate species and have adapted to a variety of marine and freshwater habitats. Their genome evolution and diversification are important subjects for the understanding of vertebrate evolution. Although draft genome sequences of two pufferfishes have been published, analysis of more fish genomes is desirable. Here we report a high-quality draft genome sequence of a small egg-laying freshwater teleost, medaka (Oryzias latipes). Medaka is native to East Asia and is an excellent model system for a wide range of biological research, including ecotoxicology, carcinogenesis, sex determination and developmental genetics. In the assembled medaka genome (700 megabases), which is less than half the size of the zebrafish genome, we predicted 20,141 genes, including ∼2,900 new genes, using 5′-end serial analysis of gene expression tag information. We found single nucleotide polymorphisms (SNPs) at an average rate of 3.42% between the two inbred strains derived from two regional populations; this is the highest SNP rate seen in any vertebrate species. Analyses based on the dense SNP information show a strict genetic separation of 4 million years (Myr) between the two populations, and suggest that differential selective pressures acted on specific gene categories. Four-way comparisons of the human, pufferfish (Tetraodon), zebrafish and medaka genomes revealed that eight major interchromosomal rearrangements took place in a remarkably short period of ∼50 Myr after the whole-genome duplication event in the teleost ancestor. Afterwards, intriguingly, the medaka genome preserved its ancestral karyotype for more than 300 Myr.
