Drug repositioning based on bounded nuclear norm regularization

Abstract

Motivation: Computational drug repositioning is a cost-effective strategy for identifying novel indications for existing drugs. Drug repositioning is often modeled as a recommendation system problem. Taking advantage of the known drug–disease associations, the recommendation system aims to identify new treatments by filling in the unknown entries of the drug–disease association matrix, a task known as matrix completion. Because common molecular pathways contribute to many different diseases, the recommendation system assumes that the underlying latent factors determining drug–disease associations are highly correlated. In other words, the drug–disease matrix to be completed is low-rank. Accordingly, matrix completion algorithms that efficiently construct low-rank approximations of the drug–disease matrix, consistent with the known associations, can be of great help in discovering novel drug–disease associations.

Results: In this article, we propose a bounded nuclear norm regularization (BNNR) method to complete the drug–disease matrix under the low-rank assumption. Instead of strictly fitting the known elements, BNNR is designed to tolerate noisy drug–drug and disease–disease similarities by incorporating a regularization term that balances the approximation error against the rank properties. Moreover, additional constraints are incorporated into BNNR to ensure that all predicted matrix entries lie within a specified interval. BNNR is carried out on the adjacency matrix of a heterogeneous drug–disease network, which integrates the drug–drug, drug–disease and disease–disease networks. It not only makes full use of the available drugs, diseases and their association information, but also handles the cold-start problem naturally. Our computational results show that BNNR yields higher drug–disease association prediction accuracy than the current state-of-the-art methods. The most significant gain is in prediction precision, measured as the fraction of positive predictions that are truly positive, which is particularly useful in drug design practice. Case studies also confirm the accuracy and reliability of BNNR.

Availability and implementation: The code of BNNR is freely available at https://github.com/BioinformaticsCSU/BNNR.

Supplementary information: Supplementary data are available at Bioinformatics online.
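The idea summarized in the Results can be illustrated with a minimal sketch; it is not the authors' implementation, which is available at the repository above. The sketch assumes an objective of the form: minimize the nuclear norm of X plus (alpha/2) times the squared error on the observed entries, subject to 0 <= X <= 1, and solves it with a generic alternating direction (ADMM) scheme. The function names, the parameters `alpha` and `beta`, the block layout of the heterogeneous matrix, and the choice to treat zero association entries as unobserved are all assumptions made for illustration.

```python
import numpy as np


def build_heterogeneous_matrix(drug_sim, disease_sim, assoc):
    """Stack the three networks into one symmetric adjacency matrix.

    assoc has shape (n_drugs, n_diseases); 1 marks a known association.
    The block layout is an illustrative assumption.
    """
    return np.block([[drug_sim, assoc], [assoc.T, disease_sim]])


def observed_mask(drug_sim, disease_sim, assoc):
    """Assumption: similarities and known (nonzero) associations are
    observed; zero association entries are treated as unknown."""
    known = assoc > 0
    return np.block([
        [np.ones(drug_sim.shape, dtype=bool), known],
        [known.T, np.ones(disease_sim.shape, dtype=bool)],
    ])


def svt(A, tau):
    """Singular value thresholding: shrink each singular value by tau."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def bounded_completion(T, mask, alpha=1.0, beta=10.0, n_iter=300, tol=1e-5):
    """ADMM sketch for the assumed bounded nuclear norm objective.

    X carries the low-rank term, W carries the data fit and the [0, 1]
    bounds, and Y is the multiplier enforcing X = W.
    """
    X = T.copy()
    W = np.clip(T, 0.0, 1.0)
    Y = np.zeros_like(T)
    for _ in range(n_iter):
        # W-update: entrywise quadratic minimizer, then projection onto [0, 1].
        W = X + Y / beta
        W[mask] = (alpha * T[mask] + beta * X[mask] + Y[mask]) / (alpha + beta)
        W = np.clip(W, 0.0, 1.0)
        # X-update: proximal step of the nuclear norm.
        X_new = svt(W - Y / beta, 1.0 / beta)
        # Multiplier update.
        Y = Y + beta * (X_new - W)
        change = np.linalg.norm(X_new - X) / max(np.linalg.norm(X), 1e-12)
        X = X_new
        if change < tol:
            break
    return W  # W satisfies the bound constraints by construction


# Toy data (made up): 3 drugs, 2 diseases; drug 1 has no known association.
drug_sim = np.array([[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]])
disease_sim = np.array([[1.0, 0.3], [0.3, 1.0]])
assoc = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])

T = build_heterogeneous_matrix(drug_sim, disease_sim, assoc)
mask = observed_mask(drug_sim, disease_sim, assoc)
scores = bounded_completion(T, mask)
predicted = scores[:3, 3:]  # drug-by-disease block of predicted scores
```

In this toy setup the second drug has no known associations, so any score it receives in `predicted` comes only through its similarity to the other drugs, which is the sense in which a heterogeneous matrix can accommodate cold start. Clipping in the W-update keeps every returned entry in [0, 1], mirroring the bounded constraint described above.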
