Finding recurrent copy number alterations preserving within-sample homogeneity

MOTIVATION: Copy number alterations (CNAs) are an important component of genetic variation and play a significant role in many human diseases. The development of array comparative genomic hybridization (aCGH) technology has made it possible to identify CNAs. Identifying recurrent CNAs is the first fundamental step toward a list of genomic regions that form the basis for further biological investigation. The main problem in recurrent CNA discovery is distinguishing functional changes from random events without pathological relevance. Within-sample homogeneity is a common feature of copy number profiles in cancer, so it can be used as an additional source of information to increase the accuracy of the results. Although several algorithms for identifying recurrent CNAs have been proposed, no comprehensive comparison of the different approaches has yet been published.

RESULTS: We propose a new approach, called Genomic Analysis of Important Alterations (GAIA), to find recurrent CNAs, in which a statistical hypothesis testing framework is extended to take within-sample homogeneity into account. Statistical significance and within-sample homogeneity are combined in an iterative procedure to extract the regions that are likely involved in functional changes. Results show that GAIA is a valid alternative to other proposed approaches. In addition, we carry out a detailed comparison using two real aCGH datasets and a carefully planned simulation study.

AVAILABILITY: GAIA has been implemented as an R/Bioconductor package. It can be downloaded from http://bioinformatics.biogem.it/download/gaia.

CONTACT: ceccarelli@unisannio.it; morganella@unisannio.it

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
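
The abstract describes the statistical framework only at a high level, so the following is a minimal sketch of the general idea of testing recurrence against a permutation null, not GAIA's actual procedure. Everything here is an assumption made for illustration: the function name `recurrence_pvalues`, the 0/1 encoding of per-probe aberration calls, the null obtained by shuffling each sample independently, and the use of the maximum per-probe count as the test statistic. It does not model within-sample homogeneity or the iterative extraction step.

```python
import random

def recurrence_pvalues(calls, n_perm=200, seed=0):
    """Permutation p-value per probe for how often it is aberrant across samples.

    calls: list of samples, each a list of 0/1 aberration flags of equal length
           (hypothetical input format, not GAIA's).
    """
    rng = random.Random(seed)
    n_probes = len(calls[0])

    # Observed statistic: number of samples in which each probe is aberrant.
    observed = [sum(sample[j] for sample in calls) for j in range(n_probes)]

    # Null distribution: shuffle probe positions within each sample, then
    # record the largest per-probe count seen anywhere on the genome.
    null_max = []
    for _ in range(n_perm):
        shuffled = [rng.sample(sample, n_probes) for sample in calls]
        counts = [sum(s[j] for s in shuffled) for j in range(n_probes)]
        null_max.append(max(counts))

    # Comparing against the null maximum adjusts for testing many probes;
    # the +1 terms keep the estimate away from an exact zero.
    return [
        (1 + sum(m >= obs for m in null_max)) / (n_perm + 1)
        for obs in observed
    ]

# Toy data: 10 samples, 50 probes. Probe 5 is aberrant in every sample;
# each sample also carries one private aberration at a different position.
calls = []
for i in range(10):
    sample = [0] * 50
    sample[5] = 1
    sample[10 + 3 * i] = 1
    calls.append(sample)

pvals = recurrence_pvalues(calls)
print(min(range(50), key=lambda j: pvals[j]))  # index of the most significant probe
```

In this toy setting the shared probe stands out because ten independent shuffles almost never place an aberration at the same position, whereas the private aberrations are indistinguishable from the null. Shuffling probes independently destroys the contiguity of real copy number segments, which is one reason a practical method needs a more careful null model than this sketch.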
