Kernel based nonlinear dimensionality reduction for microarray gene expression data analysis

Accurate recognition of cancers from microarray gene expression data is important for helping doctors choose a proper treatment. Genomic microarrays are powerful research tools in bioinformatics and modern medical research. However, even a simple microarray experiment often produces very high-dimensional data and a huge amount of information, which challenges researchers to extract the important features and reduce the dimensionality. In this paper, a kernel-based nonlinear dimensionality reduction method built on locally linear embedding is proposed; it selects the optimal number of nearest neighbors and constructs a uniformly distributed manifold. In addition, the support vector machine, which has given rise to a new class of theoretically elegant learning machines, is used to classify and recognise genomic microarray data. We demonstrate the application of these techniques on two published DNA microarray data sets. The experimental results and comparisons demonstrate that the proposed method is an effective approach.
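The abstract does not spell out how the kernel enters locally linear embedding. One common construction, shown here only as an assumption, is to pick each sample's nearest neighbors using the distance induced by a kernel k in feature space, d²(x, y) = k(x, x) − 2k(x, y) + k(y, y). The sketch below illustrates that neighbor-selection step on toy data. The function names, the RBF kernel, and the `gamma` value are illustrative choices, not taken from the paper.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative, untuned value."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)

def kernel_distance_sq(x, y, kernel):
    """Squared distance between x and y in the kernel's feature space."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)

def kernel_nearest_neighbors(samples, index, k, kernel=rbf_kernel):
    """Indices of the k samples closest to samples[index] under the kernel distance."""
    dists = [
        (kernel_distance_sq(samples[index], s, kernel), j)
        for j, s in enumerate(samples)
        if j != index
    ]
    dists.sort()
    return [j for _, j in dists[:k]]

# Toy "expression profiles": two loose groups in two dimensions.
samples = [
    [0.0, 0.0],
    [0.1, 0.0],
    [0.0, 0.2],
    [3.0, 3.0],
    [3.1, 2.9],
]
neighbors = kernel_nearest_neighbors(samples, 0, 2)
```

With an RBF kernel the neighbor ranking coincides with the Euclidean one, because the kernel distance is a monotone function of Euclidean distance; other kernels, such as a polynomial kernel, can produce a different neighborhood. The number of neighbors `k` is the parameter the abstract says the method selects, and this sketch simply takes it as an input.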
