Bioinformatics Original Paper: Computation of Recurrent Minimal Genomic Alterations from Array-CGH Data

MOTIVATION: The identification of recurrent genomic alterations can provide insight into the initiation and progression of genetic diseases such as cancer. Array-CGH can identify chromosomal regions that have been gained or lost, with a resolution of approximately 1 Mb for the most advanced techniques. The extraction of discrete profiles from raw array-CGH data has been studied extensively, but subsequent steps in the analysis require flexible, efficient algorithms, particularly if the number of available profiles exceeds a few tens or the number of array probes exceeds a few thousand.

RESULTS: We propose two algorithms for computing minimal and minimal constrained regions of gain and loss from discretized CGH profiles. The second of these algorithms can handle additional constraints describing relevant regions of copy number change. We have validated these algorithms on two public array-CGH datasets.

AVAILABILITY: From the authors, upon request.

CONTACT: celine@lri.fr

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
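
To make the idea of a recurrent region of gain or loss in the RESULTS statement above more concrete, here is a minimal sketch in Python. It is not the authors' algorithm, which this abstract does not describe. It assumes each discretized profile is a list with one status per probe (1 for gain, -1 for loss, 0 for no change), with probes ordered along the chromosome, and it reports maximal runs of consecutive probes at which at least `min_support` profiles share the requested status. The names `recurrent_regions`, `GAIN`, `LOSS` and `min_support` are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

# Assumed encoding of a discretized profile (not taken from the paper).
GAIN, LOSS, NORMAL = 1, -1, 0


def recurrent_regions(
    profiles: List[List[int]], status: int, min_support: int
) -> List[Tuple[int, int]]:
    """Return (start, end) probe indices, inclusive, of maximal runs of
    consecutive probes where at least `min_support` profiles carry `status`.

    Illustrative only: the count is taken probe by probe, so the profiles
    supporting a region may differ from one probe to the next.
    """
    if not profiles:
        return []
    n_probes = len(profiles[0])
    if any(len(p) != n_probes for p in profiles):
        raise ValueError("all profiles must cover the same probes")

    regions: List[Tuple[int, int]] = []
    start = None  # index where the current run began, if any
    for i in range(n_probes):
        count = sum(1 for p in profiles if p[i] == status)
        if count >= min_support:
            if start is None:
                start = i  # open a new run
        elif start is not None:
            regions.append((start, i - 1))  # close the run at the previous probe
            start = None
    if start is not None:
        regions.append((start, n_probes - 1))  # run extends to the last probe
    return regions


if __name__ == "__main__":
    # Three toy profiles over six probes (made-up values).
    toy_profiles = [
        [0, 1, 1, 1, 0, -1],
        [0, 0, 1, 1, 0, -1],
        [1, 1, 1, 0, 0, -1],
    ]
    print(recurrent_regions(toy_profiles, GAIN, 2))  # [(1, 3)]
    print(recurrent_regions(toy_profiles, LOSS, 3))  # [(5, 5)]
```

Each profile is scanned once per probe, so the cost grows with the number of profiles times the number of probes. A stricter definition, in which the same set of profiles must support the whole region, would require tracking the supporting profiles rather than only their count.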
