Construction of Pseudomolecule Sequences of the aus Rice Cultivar Kasalath for Comparative Genomics of Asian Cultivated Rice

Asian cultivated rice (Oryza sativa) has a deep genetic structure that evolved during its domestication and adaptation, and it displays considerable physiological and morphological variation. Here, we describe deep whole-genome sequencing of the aus rice cultivar Kasalath using next-generation sequencing (NGS) technologies to gain a better understanding of the sequence and structural changes among highly differentiated cultivars. The de novo assembled Kasalath sequences represented 91.1% (330.55 Mb) of the genome and contained 35 139 expressed loci annotated by RNA-Seq analysis. We detected 2 787 250 single-nucleotide polymorphisms (SNPs) and 7393 large insertion/deletion (indel) sites (>100 bp) between Kasalath and Nipponbare, and 2 216 251 SNPs and 3780 large indels between Kasalath and 93-11. Extensive comparison of the gene contents among these cultivars revealed similar rates of gene gain and loss. Compared with the Nipponbare reference genome, the Kasalath genome contained at least 7.39 Mb of inserted sequences and 40.75 Mb of unmapped sequences. Mapping of publicly available NGS short reads from 50 rice accessions demonstrated the necessity and value of using the Kasalath whole-genome sequence as an additional reference to capture sequence polymorphisms that cannot be discovered using the Nipponbare sequence alone.
