A New Orthogonal Discriminant Projection Based Prediction Method for Bioinformatic Data

DNA microarrays allow the transcript abundances of thousands of genes to be measured in parallel. However, gene expression profiles (GEP) are high-dimensional, come from small sample sets, and are noisy, so selecting informative tumor-related genes from them is an important step. In this paper we propose a novel feature extraction method named Orthogonal Discriminant Projection (ODP). The method is a linear approximation based on the manifold learning approach. ODP characterizes the local and non-local information of manifold-distributed data and seeks an optimal subspace that maximizes the difference between the non-local scatter and the local scatter. Moreover, it incorporates class information to enhance recognition ability. A trick is employed to handle the Small Sample Size (SSS) problem. Experimental results on Non-small Cell Lung Cancer (NSCLC) and glioma datasets validate its efficiency compared with other widely used dimensionality reduction methods such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA).
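The idea of maximizing the difference between non-local and local scatter can be illustrated with a minimal sketch. The abstract does not specify the details, so everything here is an assumption for illustration only: the function name `odp_sketch`, the binary k-nearest-neighbor weights, the rule that two samples count as local only if they are neighbors and share a class label, and the use of a plain symmetric eigendecomposition. The eigendecomposition yields orthonormal projection vectors without inverting any scatter matrix, which is one common way to sidestep the SSS problem, though it may not be the trick the paper uses.

```python
import numpy as np

def odp_sketch(X, y, n_components=2, k=3):
    """Illustrative projection maximizing (non-local scatter - local scatter).

    X: (n_samples, n_features) expression matrix; y: (n_samples,) class labels.
    Returns an (n_features, n_components) matrix with orthonormal columns.
    All weighting choices below are assumptions, not the paper's definitions.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = X.shape[0]

    # Pairwise squared Euclidean distances; a sample is never its own neighbor.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)

    # Binary k-nearest-neighbor adjacency, symmetrized.
    nn = np.argsort(d2, axis=1)[:, :k]
    H = np.zeros((n, n))
    H[np.repeat(np.arange(n), k), nn.ravel()] = 1.0
    H = np.maximum(H, H.T)

    # Class information (assumed rule): only same-class neighbors are "local".
    H *= (y[:, None] == y[None, :])

    def scatter(W):
        # X^T L X with graph Laplacian L = D - W equals
        # (1/2) * sum_ij W_ij (x_i - x_j)(x_i - x_j)^T.
        L = np.diag(W.sum(axis=1)) - W
        return X.T @ L @ X / (n * n)

    S_local = scatter(H)
    S_nonlocal = scatter((1.0 - np.eye(n)) - H)  # every remaining pair

    # Symmetric matrix: its eigenvectors are orthonormal, no inversion needed.
    M = S_nonlocal - S_local
    M = (M + M.T) / 2.0
    vals, vecs = np.linalg.eigh(M)
    top = np.argsort(vals)[::-1][:n_components]
    return vecs[:, top]
```

A projected sample is then `X @ W`, where `W` is the returned matrix. For real microarray data with thousands of genes, the feature-by-feature matrix `M` becomes large, so a preliminary reduction step would likely be needed in practice.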
