Transcript-based reannotation for microarray probesets

DNA microarrays are among the most widely used technologies for measuring gene expression. However, there are several distinct microarray platforms from different manufacturers, each with its own measurement protocol, so the resulting data are difficult to compare or integrate directly. Integrating data from multiple sources increases the number of samples available, which aims to improve the reliability of statistical tests and to mitigate the data dimensionality problem. This work intends to establish a basis for integrating gene expression measurements from several manufacturers, a problem that can be addressed at different levels. We focus on the reannotation process, a cornerstone of multi-platform integration. The proposed approach reannotates probesets to transcripts rather than to genes, preserving transcript-level information that is valuable for further analysis. Gene expression data from glioblastoma studies, obtained on Agilent and Affymetrix platforms, will serve as case studies.
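
As a rough illustration of why a transcript-level annotation can act as a common key across platforms, the following minimal Python sketch pairs probesets from two platforms through the transcripts they target. It is not the pipeline used in this work: the probeset and transcript identifiers, the two mapping tables, and the helper names `invert` and `shared_transcripts` are all hypothetical and chosen only for the example.

```python
from collections import defaultdict

# Hypothetical probeset -> transcript annotations (all identifiers are made up).
affy_map = {"affy_ps_1": ["tx_A", "tx_B"], "affy_ps_2": ["tx_C"]}
agilent_map = {"agil_ps_9": ["tx_A"], "agil_ps_7": ["tx_C", "tx_D"]}


def invert(probeset_to_transcripts):
    """Index probesets by the transcripts they are annotated to."""
    index = defaultdict(list)
    for probeset, transcripts in probeset_to_transcripts.items():
        for tx in transcripts:
            index[tx].append(probeset)
    return dict(index)


def shared_transcripts(map_a, map_b):
    """Pair probesets from two platforms via the transcripts both cover."""
    idx_a, idx_b = invert(map_a), invert(map_b)
    common = sorted(idx_a.keys() & idx_b.keys())
    return {tx: (sorted(idx_a[tx]), sorted(idx_b[tx])) for tx in common}


pairs = shared_transcripts(affy_map, agilent_map)
for tx, (affy_ps, agilent_ps) in pairs.items():
    print(tx, affy_ps, agilent_ps)
```

In this sketch, only transcripts covered by both platforms are kept, and a probeset annotated to several transcripts remains associated with each of them, which is the kind of information a gene-level mapping would collapse.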
