Performance assessment of copy number microarray platforms using a spike-in experiment

MOTIVATION: Changes in the copy number of chromosomal DNA segments [copy number variants (CNVs)] have been implicated in human variation, heritable diseases and cancers. Microarray-based platforms are the established technology of choice for studies reporting these discoveries and constitute the benchmark against which emergent sequence-based approaches will be evaluated. Research that depends on CNV analysis is rapidly increasing, and systematic platform assessments that distinguish strengths and weaknesses are needed to guide informed choice.

RESULTS: We evaluated the sensitivity and specificity of six platforms, provided by four leading vendors, using a spike-in experiment. NimbleGen and Agilent platforms outperformed Illumina and Affymetrix in accuracy and precision of copy number dosage estimates. However, Illumina and Affymetrix algorithms that leverage single nucleotide polymorphism (SNP) information make up for this disadvantage and perform well at variant detection. Overall, the NimbleGen 2.1M platform outperformed the others, but only when analyzed with an alternative data analysis pipeline rather than the one offered by the manufacturer.

AVAILABILITY: The data are available from http://rafalab.jhsph.edu/cnvcomp/.

CONTACT: pevsner@jhmi.edu; fspencer@jhmi.edu; rafa@jhu.edu

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
