Normalization of array-CGH data: influence of copy number imbalances

Background: High-resolution microarray-based comparative genomic hybridization (CGH) techniques have successfully been applied to study copy number imbalances in a number of settings such as the analysis of cancer genomes. For normalization of array-CGH data, methods initially developed for gene expression microarray analysis have, in general, been directly adopted and used. However, these methods are designed to work under assumptions that may not be valid for array-CGH data when copy number imbalances are present. We therefore sought to investigate the effect on normalization imposed by copy number imbalances.

Results: Here we demonstrate that copy number imbalances correlate with intensity in array-CGH data thereby causing problems for conventional normalization methods. We propose a strategy to circumvent these problems by taking copy number imbalances into account during normalization, and we test the proposed strategy using several data sets from the analysis of cancer genomes. In addition, we show how the strategy can be applied to conveniently define adaptive sample-specific boundaries between balanced copy number, losses, and gains to facilitate management of variation in tissue heterogeneity when calling copy number changes.

Conclusion: We highlight the importance of considering copy number imbalances during normalization of array-CGH data, and show how failure to do so can deleteriously affect data and hamper interpretation.
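The effect described in the Results can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the algorithm, so the linear intensity trend below is a simple stand-in for an intensity-dependent normalization, and the `balanced` mask, the function names, and the toy values are all hypothetical. The sketch assumes that some preliminary step has already flagged which probes lie in regions of balanced copy number, and compares fitting the trend on all probes with fitting it on the balanced probes only.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def normalize(log_ratios, intensities, balanced=None):
    """Subtract a linear intensity-dependent trend from log2 ratios.

    If `balanced` (a list of booleans, hypothetical input) is given, the
    trend is fitted only on probes flagged as balanced; otherwise it is
    fitted on all probes, as a conventional method would do.
    """
    if balanced is None:
        balanced = [True] * len(log_ratios)
    xs = [a for a, keep in zip(intensities, balanced) if keep]
    ys = [m for m, keep in zip(log_ratios, balanced) if keep]
    slope, intercept = fit_line(xs, ys)
    return [m - (slope * a + intercept)
            for m, a in zip(log_ratios, intensities)]


def mean(values):
    return sum(values) / len(values)


# Toy data (invented for illustration): four balanced probes at log2
# ratio 0 and four gained probes at log2 ratio 1. The gained probes also
# have higher average intensity, mimicking the reported correlation
# between copy number imbalance and intensity.
log_ratios = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
intensities = [8.0, 9.0, 10.0, 11.0, 11.0, 12.0, 12.0, 13.0]
balanced = [True, True, True, True, False, False, False, False]

conventional = normalize(log_ratios, intensities)
aware = normalize(log_ratios, intensities, balanced)

# Separation between gained and balanced probes after each normalization.
sep_conventional = mean(conventional[4:]) - mean(conventional[:4])
sep_aware = mean(aware[4:]) - mean(aware[:4])
```

In this toy case the trend fitted on all probes absorbs part of the gain, so the gained and balanced probes end up closer together than they started, while the trend fitted on balanced probes only leaves their separation intact. Real data would call for a more flexible fit and a principled way of identifying the balanced probes.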
