A discrete optimization approach for SVD best truncation choice based on ROC curves

Truncated Singular Value Decomposition (SVD) is a widely used algorithm in machine learning and is applied in many scientific fields. Despite its long history and prevalence, choosing the best truncation level remains an open challenge. In this paper, we describe a new algorithm, akin to a discrete optimization method, that relies on computing the Areas Under the Curve (AUCs) of Receiver Operating Characteristic (ROC) curves. We explore a concrete application of the algorithm to a bioinformatics problem, namely the prediction of biomolecular annotations. We applied the algorithm to nine different datasets, and the results demonstrate the effectiveness of our technique.
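To make the idea of selecting a truncation level by ROC AUC more concrete, the following is a minimal sketch and not the algorithm of the paper itself. It assumes a binary annotation matrix `A`, hides a random fraction of its known annotations, reconstructs the matrix at each candidate rank `k`, and keeps the rank whose reconstruction scores the hidden annotations with the highest AUC. The exhaustive search over candidates, the hold-out scheme, and the names `roc_auc`, `best_truncation`, `candidate_ks` and `holdout_fraction` are all assumptions made for illustration.

```python
import numpy as np

def roc_auc(labels, scores):
    """ROC AUC via the rank-sum (Mann-Whitney U) formula, averaging ranks over ties."""
    labels = np.asarray(labels, dtype=bool).ravel()
    scores = np.asarray(scores, dtype=float).ravel()
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)                 # last 1-based rank of each tie group
    avg_rank = last_rank - (counts - 1) / 2.0     # average rank within the group
    ranks = avg_rank[inverse]
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)

def best_truncation(A, candidate_ks, holdout_fraction=0.2, seed=0):
    """Hypothetical helper: pick the rank k with the highest hold-out AUC."""
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    ones = np.argwhere(A == 1)
    n_hide = max(1, int(holdout_fraction * len(ones)))
    hidden = ones[rng.choice(len(ones), size=n_hide, replace=False)]
    A_train = A.copy()
    A_train[hidden[:, 0], hidden[:, 1]] = 0.0     # hide some known annotations
    U, s, Vt = np.linalg.svd(A_train, full_matrices=False)
    eval_mask = A_train == 0                      # score only cells unknown in training
    labels = A[eval_mask]                         # 1 where an annotation was hidden
    aucs = {}
    for k in candidate_ks:
        A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]      # rank-k reconstruction
        aucs[k] = roc_auc(labels, A_k[eval_mask])
    best_k = max(aucs, key=aucs.get)
    return best_k, aucs

# Usage on a synthetic, noisy block-structured matrix (illustrative data only).
rng = np.random.default_rng(42)
genes = rng.integers(0, 3, size=60)
terms = rng.integers(0, 3, size=40)
A = (genes[:, None] == terms[None, :]).astype(float)
A = np.where(rng.random(A.shape) < 0.05, 1 - A, A)   # flip 5% of the cells
best_k, aucs = best_truncation(A, candidate_ks=range(1, 11))
print(best_k, round(aucs[best_k], 3))
```

An exhaustive scan is used here only because the candidate set is small; a discrete optimization scheme would aim to evaluate fewer truncation levels, since each evaluation requires a reconstruction and an AUC computation.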
