Visualization and analysis of molecular data.

This chapter provides an overview of visualization and analysis techniques applied to large-scale datasets from genomics, metabolomics, and proteomics. The aim is to reduce the number of variables (genes, metabolites, or proteins) by extracting a small set of new, relevant variables, usually termed components. The advantages and disadvantages of classical principal component analysis (PCA) are discussed, and links are drawn to the closely related singular value decomposition and multidimensional scaling. Special emphasis is given to the recent trend toward independent component analysis (ICA), which aims to extract statistically independent components and therefore usually provides more meaningful components than PCA. We also discuss normalization techniques and their influence on the results of the different analytical techniques.
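
As an illustration of the link between principal component analysis and the singular value decomposition mentioned above, the following is a minimal sketch, not the chapter's own implementation. It assumes a data matrix with samples in rows and variables (genes, metabolites, or proteins) in columns; the function name `pca_via_svd`, the parameter `n_components`, and the synthetic data are hypothetical and chosen only for this example.

```python
import numpy as np


def pca_via_svd(X, n_components=2):
    """Reduce the variables (columns of X) to a few principal components.

    X is assumed to have one sample per row and one variable per column.
    """
    # Centre each variable so the SVD describes variance around the mean.
    Xc = X - X.mean(axis=0)
    # Thin SVD: Xc = U @ diag(s) @ Vt, singular values sorted descending.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Sample coordinates in the space of the leading components.
    scores = U[:, :n_components] * s[:n_components]
    # Component directions expressed as weights on the original variables.
    loadings = Vt[:n_components]
    # Variance captured by each component (sample covariance eigenvalues).
    explained_variance = s[:n_components] ** 2 / (X.shape[0] - 1)
    return scores, loadings, explained_variance


# Synthetic stand-in for an expression matrix: 20 samples, 100 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 100))
scores, loadings, explained_variance = pca_via_svd(X, n_components=3)
```

In this sketch the squared singular values, divided by the number of samples minus one, equal the leading eigenvalues of the sample covariance matrix, which is the sense in which the two decompositions are closely related. Independent component analysis would need a separate algorithm and is not covered by this example.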
