Whole Genome Profiling provides a robust framework for physical mapping and sequencing in the highly complex and repetitive wheat genome

Background: Sequencing projects using a clone-by-clone approach require a robust physical map. The SNaPshot technology, based on pair-wise comparisons of restriction fragment sizes, has been used recently to build the first physical map of a wheat chromosome and to complete the maize physical map. However, in such large and repetitive genomes, restriction fragment sizes shared by chance between two non-overlapping BACs often lead to chimeric contigs and mis-assembled BACs. Whole Genome Profiling (WGP™) was developed recently as a new sequence-based physical mapping technology and has the potential to limit this problem.

Results: A subset of the wheat 3B chromosome BAC library covering 230 Mb was used to establish a WGP physical map and to compare it to a map obtained with the SNaPshot technology. We first adapted the WGP-based assembly methodology to cope with the complexity of the wheat genome. The results showed that the WGP map covers the same length as the SNaPshot map but with 30% fewer contigs and, more importantly, with 3.5 times fewer mis-assembled BACs. Finally, we evaluated the benefit of integrating WGP tags in different sequence assemblies obtained after Roche/454 sequencing of BAC pools. We showed that while WGP tag integration improves assemblies performed with unpaired reads and with paired-end reads at low coverage, it does not significantly improve sequence assemblies performed at high coverage (25x) with paired-end reads.

Conclusions: Our results demonstrate that, with a suitable assembly methodology, WGP builds more robust physical maps than the SNaPshot technology in wheat and that WGP can be adapted to any genome. Moreover, WGP tag integration in sequence assemblies improves low-quality assemblies. However, achieving a high-quality draft sequence assembly requires a sequencing depth of 25x with paired-end reads, at which point WGP tag integration does not provide additional scaffolding value. Finally, we suggest that WGP tags can support the efficient sequencing of BAC pools by enabling reliable assignment of sequence scaffolds to their BAC of origin, a feature that is of great interest when using BAC pooling strategies to reduce the cost of sequencing large genomes.
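The contrast between size-based and sequence-based pair-wise comparison can be illustrated with a toy sketch. It is not the scoring used by any fingerprint assembly software; the function names, the 0.4 bp size tolerance, and all fragment sizes and tag sequences below are hypothetical values chosen only for illustration. The sketch shows how two non-overlapping BACs can appear to share fragments when sizes are compared within a tolerance, while exact matching of sequence tags reports no overlap.

```python
from typing import Iterable


def shared_fragments(sizes_a: Iterable[float], sizes_b: Iterable[float],
                     tolerance: float = 0.4) -> int:
    """Count fragments of clone A that match a fragment of clone B in size.

    Two fragments "match" when their sizes differ by at most `tolerance`
    (hypothetical value, in bp). Each fragment of B is used at most once.
    """
    remaining = sorted(sizes_b)
    shared = 0
    for size in sorted(sizes_a):
        for i, other in enumerate(remaining):
            if abs(size - other) <= tolerance:
                shared += 1
                del remaining[i]  # consume the matched fragment
                break
    return shared


def shared_tags(tags_a: Iterable[str], tags_b: Iterable[str]) -> int:
    """Count sequence tags present in both clones (exact string match)."""
    return len(set(tags_a) & set(tags_b))


# Hypothetical data for two BACs that do NOT overlap.
bac_a_sizes = [112.3, 187.9, 245.1, 301.6]
bac_b_sizes = [112.5, 188.2, 260.0, 410.7]
bac_a_tags = ["ACGTTGCA", "TTGACCAG"]
bac_b_tags = ["GGCATTAC", "CATGGTCA"]

# Size comparison finds two chance matches; tag comparison finds none.
print(shared_fragments(bac_a_sizes, bac_b_sizes))  # 2
print(shared_tags(bac_a_tags, bac_b_tags))         # 0
```

In this toy case the two chance size matches could be taken as evidence of overlap, which is the kind of false join described above, whereas the tag comparison requires identical sequence and so is less exposed to coincidental matches.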
