Robust Preprocessing of Gene Expression Microarrays for Independent Component Analysis

Oligonucleotide microarrays are useful tools in genetic research because they provide parallel scanning mechanisms that detect the presence of genes using test probes composed of controlled segments of gene code built by masking techniques. The detection of each gene depends on the multichannel differential expression of perfectly matched segments (PM) against mismatched ones (MM). This methodology, devised to make the detection process more robust, poses some interesting problems from the point of view of Genomic Signal Processing, because test probe expressions do not agree well with the proportionality assumption for most of the genes explored. These cases may be influenced by unexpected hybridization dynamics and are worth studying with a double objective: to gain insight into hybridization dynamics in microarrays, and to improve microarray production and processing. Recently, Independent Component Analysis (ICA) has been proposed to process microarrays. This promising technique requires pre-processing of the microarray contents. The present work proposes de-correlating the test probes on the basis of probe structure rather than with the classical “blind” whitening techniques currently used in ICA. Results confirm that this methodology may provide the correct alignment of the PM-MM pairs while maintaining the underlying information present in the probe sets.
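For orientation, the classical "blind" whitening step that the abstract contrasts against can be sketched as follows. This is a minimal illustration of standard PCA whitening, not the structure-based de-correlation proposed in the work. The synthetic PM and MM intensities, the 0.4 cross-hybridization factor, the noise level, and the function name `whiten` are all assumptions made for the example.

```python
import numpy as np


def whiten(X):
    """PCA-whiten the rows of X (channels x samples).

    Returns the whitened data Z and the whitening matrix W,
    so that the sample covariance of Z is the identity.
    """
    # Remove the mean of each channel.
    Xc = X - X.mean(axis=1, keepdims=True)
    # Sample covariance between channels (unbiased, divides by n - 1).
    cov = Xc @ Xc.T / (Xc.shape[1] - 1)
    # Eigendecomposition of the symmetric covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Rotate onto the principal axes and rescale each to unit variance.
    W = np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
    return W @ Xc, W


# Hypothetical probe data: MM is modeled as a scaled, noisy copy of the
# same underlying signal as PM (an assumption for illustration only).
rng = np.random.default_rng(0)
signal = rng.gamma(shape=2.0, scale=100.0, size=500)
pm = signal + rng.normal(0.0, 10.0, size=500)
mm = 0.4 * signal + rng.normal(0.0, 10.0, size=500)

X = np.vstack([pm, mm])   # 2 channels x 500 probes
Z, W = whiten(X)
cov_Z = np.cov(Z)         # approximately the 2 x 2 identity matrix
```

Because the whitening matrix here is derived only from the sample covariance, it mixes the PM and MM channels without any reference to how the probes are built, which is the sense in which the step is "blind".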
