Identification of copy number variation in French dairy and beef breeds using next-generation sequencing

Abstract

Background: Copy number variations (CNV) are known to play a major role in genetic variability and disease pathogenesis in several species including cattle. In this study, we report the identification and characterization of CNV in eight French beef and dairy breeds using whole-genome sequence data from 200 animals. Bioinformatics analyses to search for CNV were carried out using four different but complementary tools and we validated a subset of the CNV by both in silico and experimental approaches.

Results: We report the identification and localization of 4178 putative deletion-only, duplication-only and CNV regions, which cover 6% of the bovine autosomal genome; they were validated by two in silico approaches and/or experimentally validated using array-based comparative genomic hybridization and single nucleotide polymorphism genotyping arrays. The size of these variants ranged from 334 bp to 7.7 Mb, with an average size of ~54 kb. Of these 4178 variants, 3940 were deletions, 67 were duplications and 171 corresponded to both deletions and duplications, which were defined as potential CNV regions. Gene content analysis revealed that, among these variants, 1100 deletions and duplications encompassed 1803 known genes, which affect a wide spectrum of molecular functions, and 1095 overlapped with known QTL regions.

Conclusions: Our study is a large-scale survey of CNV in eight French dairy and beef breeds. These CNV will be useful to study the link between genetic variability and economically important traits, and to improve our knowledge on the genomic architecture of cattle.
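The three region classes named in the Results (deletion-only, duplication-only, and CNV regions covering both) can be illustrated with a minimal sketch. It assumes that overlapping calls on the same chromosome are merged into one region and that a region is labelled by the call types it contains. The abstract does not state the merging rule actually used, so the `Call` type, the `classify_regions` function and the overlap criterion are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """One putative variant call (hypothetical structure, closed coordinates)."""
    chrom: str
    start: int
    end: int
    kind: str  # "DEL" for a deletion, "DUP" for a duplication


def classify_regions(calls):
    """Merge overlapping calls per chromosome and label each merged region.

    Assumption: two calls belong to the same region when they share at
    least one base. The study's real criterion may differ.
    """
    regions = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.start, c.end)):
        last = regions[-1] if regions else None
        if last and last["chrom"] == call.chrom and call.start <= last["end"]:
            # Overlap with the current region: extend it and record the type.
            last["end"] = max(last["end"], call.end)
            last["kinds"].add(call.kind)
        else:
            regions.append({"chrom": call.chrom, "start": call.start,
                            "end": call.end, "kinds": {call.kind}})

    def label(kinds):
        if kinds == {"DEL"}:
            return "deletion-only"
        if kinds == {"DUP"}:
            return "duplication-only"
        return "CNV region"  # both deletions and duplications observed

    return [(r["chrom"], r["start"], r["end"], label(r["kinds"]))
            for r in regions]


example_calls = [
    Call("chr1", 100, 500, "DEL"),
    Call("chr1", 400, 900, "DUP"),
    Call("chr1", 2000, 2500, "DEL"),
    Call("chr2", 100, 300, "DUP"),
]
example_regions = classify_regions(example_calls)
```

With the toy calls above, the first two overlap and contain both types, so they form a single CNV region; the remaining two stay deletion-only and duplication-only. The reported totals follow the same partition: 3940 + 67 + 171 = 4178.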
