Classification of Large Biomedical Data Using ANNs Based on BFGS Method

Artificial neural networks (ANNs) have been widely used for knowledge extraction from biomedical datasets and play an important role in bio-data exploration and analysis. In this work, we propose a new curvilinear algorithm for training large neural networks, based on an analysis of the eigenstructure of the memoryless BFGS matrices. The proposed method preserves the strong convergence properties provided by the quasi-Newton direction while exploiting the nonconvexity of the error surface through the computation of a negative curvature direction, without any matrix storage or factorization. Moreover, to improve the generalization capability of the trained ANNs, we explore the incorporation of several dimensionality reduction techniques as a pre-processing step.
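As a rough illustration of why a memoryless BFGS step needs no matrix storage, the sketch below computes the quasi-Newton direction d = -Hg using inner products only. It assumes the standard textbook memoryless BFGS update (the BFGS inverse update applied to the identity matrix). The function names, the `eps` safeguard, and the steepest-descent fallback are illustrative assumptions. The sketch covers only the quasi-Newton direction, not the negative curvature direction or the curvilinear search of the proposed method.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def memoryless_bfgs_direction(g, s, y, eps=1e-10):
    """Return d = -H g for the memoryless BFGS matrix H built from (s, y).

    H = I - (s y^T + y s^T) / (s^T y) + (1 + y^T y / s^T y) * s s^T / (s^T y)

    g : current gradient
    s : last step,            s = w_new - w_old
    y : last gradient change, y = g_new - g_old
    Only vectors and scalars are used; H is never formed.
    """
    sy = dot(s, y)
    if sy <= eps:
        # Curvature condition s^T y > 0 fails: fall back to steepest descent
        # (an assumption of this sketch, not taken from the paper).
        return [-gi for gi in g]

    sg = dot(s, g)
    yg = dot(y, g)
    yy = dot(y, y)

    coef_s = yg / sy - (1.0 + yy / sy) * sg / sy  # multiplier of s in -H g
    coef_y = sg / sy                              # multiplier of y in -H g
    return [-gi + coef_s * si + coef_y * yi for gi, si, yi in zip(g, s, y)]


# Hypothetical data, chosen so that s^T y > 0.
s = [1.0, 0.5, -0.2]
y = [0.8, 0.7, -0.1]
g = [0.3, -0.4, 0.9]
d = memoryless_bfgs_direction(g, s, y)
```

Because H is positive definite whenever s^T y > 0, the returned d is a descent direction. The cost per iteration is a handful of inner products, which is what makes the approach attractive for networks with many weights.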
