Fast Nonnegative Matrix Factorization and Its Application for Protein Fold Recognition

We study linear, unsupervised dimensionality reduction via matrix factorization with nonnegativity constraints. These constraints set it apart from other linear dimensionality reduction methods. Here we explore nonnegative matrix factorization in combination with three nearest-neighbor classifiers for protein fold recognition. Because the factorization is typically computed iteratively, convergence can be slow. To speed it up, we scale (normalize) the features before the iterations begin. This makes the algorithm significantly (more than 11 times) faster, and we explain why the speed-up occurs. A second modification of the standard nonnegative matrix factorization algorithm combines two known techniques for mapping unseen data, an operation that is typically necessary before the data can be classified in the low-dimensional space. Combining the two mapping techniques can yield better accuracy than using either technique alone. The gains, however, depend on the state of the random number generator used to initialize the iterations, on the classifier, and on its parameters. In particular, with the best of the three classifiers and the original dimensionality reduced by around 30%, the gains can exceed 4% compared to classification in the original, high-dimensional space.
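
The pipeline described above (feature scaling, iterative factorization, mapping of unseen data) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made for illustration only: min-max scaling of each feature to [0, 1], the Frobenius-norm multiplicative updates of Lee and Seung, a features-by-samples data layout, and a combined mapping in which a clipped pseudo-inverse projection initializes multiplicative updates with the basis held fixed. The function names `scale_features`, `nmf`, and `map_unseen` are hypothetical.

```python
import numpy as np

def scale_features(V, lo=None, hi=None):
    """Min-max scale each feature (row of V) to [0, 1].

    Assumed scaling scheme; pass the training lo/hi to scale unseen data.
    """
    if lo is None:
        lo = V.min(axis=1, keepdims=True)
        hi = V.max(axis=1, keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return np.clip((V - lo) / span, 0.0, 1.0), lo, hi

def nmf(V, r, n_iter=200, seed=0, eps=1e-9):
    """Factorize V (features x samples) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)  # results depend on this seed
    n, m = V.shape
    W = rng.random((n, r)) + eps
    H = rng.random((r, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates for the Frobenius-norm objective.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def map_unseen(W, V_new, n_iter=100, eps=1e-9):
    """Map unseen, already scaled data into the space spanned by W.

    Assumed combination: a clipped least-squares projection serves as
    the starting point for multiplicative updates with W fixed.
    """
    H = np.maximum(np.linalg.pinv(W) @ V_new, eps)
    for _ in range(n_iter):
        H *= (W.T @ V_new) / (W.T @ W @ H + eps)
    return H
```

In use, the training matrix would be scaled first, factorized with `nmf`, and the test matrix scaled with the training `lo` and `hi` before being passed to `map_unseen`; the columns of the resulting coefficient matrices are the low-dimensional points handed to a nearest-neighbor classifier.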
