Augmenting Chinese hamster genome assembly by identifying regions of high confidence.

Chinese hamster ovary (CHO) cell lines are the dominant industrial workhorses for therapeutic recombinant protein production. The availability of genome sequences for the Chinese hamster and CHO cells will spur further genome and RNA sequencing of producing cell lines. However, mammalian genomes assembled from shotgun sequencing data still contain regions of uncertain quality due to assembly errors. Identifying high-confidence regions in the assembled genome will facilitate its use for cell engineering and genome engineering. We assembled two independent drafts of the Chinese hamster genome: one by de novo assembly from shotgun sequencing reads, and the other by re-scaffolding and gap-filling the draft genome from NCBI to improve scaffold lengths and gap fractions. We then used the two independent assemblies to identify high-confidence regions using two different approaches. First, the two assemblies were compared at the sequence level to identify their consensus regions as "high confidence regions", which account for at least 78% of the assembled genome. Second, a genome-wide comparison of the Chinese hamster scaffolds with mouse chromosomes revealed scaffolds with large blocks of collinearity, which were also compiled as high-quality scaffolds. Genome-scale collinearity was complemented with EST-based synteny, which also revealed conserved gene order relative to mouse. As cell line sequencing becomes more common, the approaches reported here are useful for assessing assembly quality and can potentially facilitate the engineering of cell lines.
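The consensus-region idea can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes that a whole-genome aligner has already produced aligned intervals between the two assemblies, expressed as hypothetical `(scaffold, start, end)` tuples in half-open coordinates on one assembly. The function names, the input layout, and the toy numbers are all illustrative assumptions. The sketch merges overlapping intervals per scaffold and reports the fraction of the assembly they cover.

```python
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching half-open intervals [start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def high_confidence_fraction(alignments, scaffold_lengths):
    """Fraction of assembled bases covered by consensus (aligned) intervals.

    alignments: iterable of (scaffold, start, end) tuples (hypothetical format).
    scaffold_lengths: dict mapping scaffold name to its length in bases.
    """
    by_scaffold = defaultdict(list)
    for scaffold, start, end in alignments:
        by_scaffold[scaffold].append((start, end))

    covered = sum(
        end - start
        for intervals in by_scaffold.values()
        for start, end in merge_intervals(intervals)
    )
    total = sum(scaffold_lengths.values())
    return covered / total


# Toy data, invented purely for illustration.
scaffold_lengths = {"scaf1": 1000, "scaf2": 500}
alignments = [
    ("scaf1", 0, 400),
    ("scaf1", 300, 700),  # overlaps the previous interval
    ("scaf2", 100, 300),
]
fraction = high_confidence_fraction(alignments, scaffold_lengths)
print(f"High-confidence fraction: {fraction:.2f}")
```

Merging before summing keeps bases in overlapping alignments from being counted twice, which would otherwise inflate the reported fraction.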
