Method

Substantial deletion overlap among divergent Arabidopsis genomes revealed by intersection of short reads and tiling arrays

Identification of small polymorphisms from next-generation sequencing short-read data is relatively easy, but detection of larger deletions is less straightforward. Here, we analyzed four divergent Arabidopsis accessions and found that the intersection of absent short-read coverage with weak tiling array hybridization signal reliably flags deletions. Individual deletions were frequently observed in two or more of the accessions examined, suggesting that variation in gene content partly reflects a common history of deletion events.
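
The intersection idea can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes per-base read depth and a per-base array signal are already available as equal-length lists, and the names flag_deletions, shared_bases, signal_cutoff and min_len are hypothetical, as are the toy values. A region is flagged only when both lines of evidence agree: zero read coverage and hybridization signal below a chosen cutoff.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in reference coordinates


def flag_deletions(read_depth: Sequence[int],
                   array_signal: Sequence[float],
                   signal_cutoff: float,
                   min_len: int) -> List[Interval]:
    """Flag runs where read depth is zero AND array signal is weak.

    signal_cutoff and min_len are illustrative parameters, not values
    taken from the paper.
    """
    if len(read_depth) != len(array_signal):
        raise ValueError("depth and signal tracks must have equal length")

    candidates: List[Interval] = []
    start = None  # start of the current run supported by both data types
    for pos, (depth, signal) in enumerate(zip(read_depth, array_signal)):
        supported = depth == 0 and signal < signal_cutoff
        if supported and start is None:
            start = pos
        elif not supported and start is not None:
            if pos - start >= min_len:
                candidates.append((start, pos))
            start = None
    # Close a run that extends to the end of the track.
    if start is not None and len(read_depth) - start >= min_len:
        candidates.append((start, len(read_depth)))
    return candidates


def shared_bases(a: List[Interval], b: List[Interval]) -> int:
    """Count bases covered by deletion calls in both accessions.

    Both lists must be sorted and non-overlapping, as flag_deletions returns.
    """
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


if __name__ == "__main__":
    depth = [5, 0, 0, 0, 0, 3, 0, 0]
    signal = [9.0, 1.0, 0.5, 0.8, 1.2, 8.0, 7.0, 0.4]
    calls = flag_deletions(depth, signal, signal_cutoff=2.0, min_len=3)
    print(calls)
    print(shared_bases(calls, [(3, 9)]))
```

In the toy data, position 6 has no reads but a strong array signal, so it is not flagged. Requiring agreement between the two data types is what filters out regions that merely lack coverage. The shared_bases helper shows one simple way to quantify deletion overlap between two accessions.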
