Haplotype-resolved genomes provide insights into structural variation and gene content in Angus and Brahman cattle

Inbred animals were historically chosen for genome analysis to circumvent assembly issues caused by haplotype variation, but this resulted in a composite of the two genomes. Here we report a haplotype-aware scaffolding and polishing pipeline, which was used to create haplotype-resolved, chromosome-level genome assemblies of Angus (taurine) and Brahman (indicine) cattle subspecies from contigs generated by the trio binning method. These assemblies reveal structural and copy number variants that differentiate the subspecies and show that variant detection is sensitive to the specific reference genome chosen. Six genes with immune-related functions have additional copies in the indicine lineage compared with the taurine lineage, and an indicus-specific extra copy of fatty acid desaturase is under positive selection. The haplotyped genomes also enable transcripts to be phased to detect allele-specific expression. This work exemplifies the value of haplotype-resolved genomes for exploring evolutionary and functional variation.

Taurine and indicine cattle have different desirable traits, making them better adapted to different climates across the world. Here, Low et al. describe a pipeline to produce haplotype-resolved, chromosome-level genomes of Angus and Brahman cattle breeds from a crossbred individual and report on comparisons of the two genomes.
