ACNE: a summarization method to estimate allele-specific copy numbers for Affymetrix SNP arrays

MOTIVATION: Current algorithms for estimating DNA copy numbers (CNs) borrow concepts from gene expression analysis methods. However, single nucleotide polymorphism (SNP) arrays have special characteristics that, if taken into account, can improve overall performance. For example, cross-hybridization between alleles occurs in SNP probe pairs. In addition, most current CN methods focus on total CNs, although allele-specific CNs have been shown to be of paramount importance for some studies. Therefore, we have developed a summarization method that estimates high-quality allele-specific CNs.

RESULTS: The proposed method estimates allele-specific DNA CNs for all Affymetrix SNP arrays, dealing directly with the cross-hybridization between probes within SNP probesets. The algorithm outperforms, or at least performs as well as, other state-of-the-art algorithms for computing DNA CNs. It better discerns an aberration from a normal state, and it gives more precise allele-specific CNs.

AVAILABILITY: The method is available in the open-source R package ACNE, which also includes an add-on to the aroma.affymetrix framework (http://www.aroma-project.org/).
