Detecting heterozygosity in shotgun genome assemblies: Lessons from obligately outcrossing nematodes.

The majority of nematodes are gonochoristic (dioecious), with distinct male and female sexes, but the best-studied species, Caenorhabditis elegans, is a self-fertile hermaphrodite. The sequencing of the genomes of C. elegans and a second hermaphrodite, C. briggsae, was facilitated in part by the low natural heterozygosity that typifies selfing species. Ongoing genome projects for gonochoristic Caenorhabditis species seek to approximate this condition through intense inbreeding prior to sequencing. Here we show that, despite this inbreeding, the heterozygous fraction of the whole genome shotgun (WGS) assemblies of three gonochoristic Caenorhabditis species, C. brenneri, C. remanei, and C. japonica, is considerable. We first demonstrate experimentally that independently assembled sequence variants in C. remanei and C. brenneri are allelic. We then present gene-based approaches for recognizing heterozygous regions of WGS assemblies. We also develop a simple method for quantifying heterozygosity that can be applied to assemblies lacking gene annotations. We consistently find that approximately 10% and 30% of the C. remanei and C. brenneri genomes, respectively, are represented by two alleles in the assemblies. Heterozygosity is restricted to autosomes, and its retention is accompanied by substantial inbreeding depression, suggesting that it is caused by multiple recessive deleterious alleles and not merely by chance. Both the overall amount and the chromosomal distribution of heterozygous DNA are highly variable between assemblies of close relatives produced by identical methodologies, and allele frequencies have continued to change after the strains were sequenced. Our results highlight the impact of mating systems on genome sequencing projects.
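To give a sense of why retained heterozygosity of 10% to 30% is notable, the following minimal sketch computes the textbook neutral expectation for heterozygosity under repeated full-sib mating, H_t = H_{t-1}/2 + H_{t-2}/4. This is not the quantification method of the paper. The full-sib scheme, the function name `expected_heterozygosity`, and the choice of 20 generations are illustrative assumptions, since the abstract does not state the inbreeding protocol.

```python
def expected_heterozygosity(generations, h0=1.0):
    """Neutral expectation of heterozygosity, relative to h0, after a given
    number of generations of full-sib mating (hypothetical helper).

    Uses the standard recurrence H_t = H_{t-1}/2 + H_{t-2}/4, starting from
    unrelated founders (H_0 = H_{-1} = h0). Selection is ignored.
    """
    if generations < 0:
        raise ValueError("generations must be non-negative")
    h_prev2, h_prev1 = h0, h0  # H_{-1} and H_0
    for _ in range(generations):
        h_prev2, h_prev1 = h_prev1, h_prev1 / 2 + h_prev2 / 4
    return h_prev1


if __name__ == "__main__":
    # 20 generations is an assumed, illustrative value.
    for t in (0, 1, 2, 3, 10, 20):
        print(t, round(expected_heterozygosity(t), 4))
```

Under these assumptions the neutral expectation drops below 2% of the starting heterozygosity by generation 20, well under the fractions reported here, which is in keeping with the suggestion that retention is not merely due to chance.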
