Locally Adaptive Statistical Procedures for the Integrative Analysis on Genomic and Transcriptional Data

The systematic integration of expression profiles with other types of gene information, such as copy number, chromosomal localization, and sequence characteristics, remains a challenge in genomics. In particular, the integrative analysis of genomic and transcriptional data in the context of the physical location of genes in a genome appears promising for detecting chromosomal regions with the structural and transcriptional imbalances that often characterize cancer. We describe a computational framework based on locally adaptive statistical procedures (Global Smoothing Copy Number, GLSCN, and Locally Adaptive Statistical Procedure, LAP), which combine genomic and transcriptional data with structural information to identify imbalanced chromosomal regions. Both GLSCN and LAP account for variations in the distance between genes and in gene density by smoothing standard statistics on gene position before testing the significance of copy number and gene expression signals. Applying GLSCN and LAP to the integrative analysis of a human metastatic clear cell renal carcinoma cell line (Caki-1) allowed the identification of chromosomal regions that are directly involved in known chromosomal aberrations characteristic of tumors.
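
The idea of smoothing a per-gene statistic on gene position before testing significance can be illustrated with a minimal sketch. This is not the published GLSCN or LAP implementation: the Gaussian kernel, the k-nearest-neighbour bandwidth rule, the permutation scheme, and all function names (`local_bandwidths`, `adaptive_smooth`, `permutation_pvalues`) are assumptions chosen only to show how a bandwidth that widens in gene-poor regions and narrows in gene-dense ones might be combined with an empirical test.

```python
import math
import random


def local_bandwidths(positions, k):
    """Distance from each gene to its k-th nearest neighbour (assumed rule).

    Index 0 of the sorted distances is the gene itself, so k must be
    smaller than the number of genes.
    """
    return [sorted(abs(p - q) for q in positions)[k] for p in positions]


def adaptive_smooth(positions, stats, k=5):
    """Gaussian kernel-weighted mean of `stats` with a per-gene bandwidth."""
    smoothed = []
    for p, h in zip(positions, local_bandwidths(positions, k)):
        h = max(h, 1e-12)  # guard against coincident positions
        weights = [math.exp(-0.5 * ((p - q) / h) ** 2) for q in positions]
        total = sum(weights)
        smoothed.append(sum(w * s for w, s in zip(weights, stats)) / total)
    return smoothed


def permutation_pvalues(positions, stats, k=5, n_perm=200, seed=0):
    """Empirical two-sided p-values for the smoothed statistic.

    The null distribution is built by shuffling the statistics across
    gene positions and re-smoothing (an assumed permutation scheme).
    """
    rng = random.Random(seed)
    observed = adaptive_smooth(positions, stats, k)
    exceed = [0] * len(stats)
    shuffled = list(stats)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null = adaptive_smooth(positions, shuffled, k)
        for i, (n, o) in enumerate(zip(null, observed)):
            if abs(n) >= abs(o):
                exceed[i] += 1
    # Add-one correction keeps every p-value strictly positive.
    pvalues = [(e + 1) / (n_perm + 1) for e in exceed]
    return observed, pvalues
```

In this sketch, `positions` would hold gene coordinates along one chromosome and `stats` a standard per-gene statistic (for example, a copy number or expression score). In a genome-wide setting the resulting p-values would still need a multiple-testing adjustment, which is omitted here.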
