Sparse Graph Regularization Non-Negative Matrix Factorization Based on Huber Loss Model for Cancer Data Analysis

Non-negative matrix factorization (NMF) is a matrix decomposition method based on the square loss function. NMF is often applied to cancer gene expression data to reduce dimensionality and extract cancer-related information. Gene expression data usually contain noise and outliers, and the original NMF loss function is very sensitive to non-Gaussian noise. To improve the robustness and clustering performance of the algorithm, we propose a sparse graph regularization NMF based on the Huber loss model for cancer data analysis (Huber-SGNMF). The Huber loss is a function that lies between the L1-norm and the L2-norm and can effectively handle non-Gaussian noise and outliers. To account for matrix sparsity and the geometric structure of the data, sparse penalty and graph regularization terms are introduced into the model to enhance matrix sparsity and capture the data manifold structure. Before the experiments, we first analyze the robustness of Huber-SGNMF and other models. Experiments on The Cancer Genome Atlas (TCGA) data show that Huber-SGNMF performs better than other state-of-the-art methods in sample clustering and differentially expressed gene selection.
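To make the role of the Huber loss concrete, the following is a minimal NumPy sketch, not the authors' implementation. The `huber` function uses the standard textbook definition (quadratic for small residuals, linear for large ones). The `objective` function is only one plausible way to combine a Huber fit term with a graph regularization term and a sparsity penalty; its exact form, the threshold `c`, and the weights `alpha` and `beta` are assumptions introduced for illustration and are not specified in the abstract.

```python
import numpy as np

def huber(residual, c=1.0):
    """Elementwise Huber loss: 0.5*r^2 if |r| <= c, else c*|r| - 0.5*c^2."""
    r = np.abs(np.asarray(residual, dtype=float))
    return np.where(r <= c, 0.5 * r**2, c * r - 0.5 * c**2)

def squared(residual):
    """Elementwise square loss, shown for comparison with the Huber loss."""
    r = np.asarray(residual, dtype=float)
    return 0.5 * r**2

def objective(X, U, V, L, c=1.0, alpha=0.1, beta=0.1):
    """Hypothetical Huber-SGNMF-style objective (an assumed form).

    X: (m, n) non-negative data, U: (m, k) basis, V: (k, n) coefficients,
    L: (n, n) graph Laplacian built over the n samples.
    """
    fit = huber(X - U @ V, c).sum()          # robust reconstruction error
    graph = alpha * np.trace(V @ L @ V.T)    # manifold (graph) regularizer
    sparse = beta * np.abs(V).sum()          # L1 sparsity penalty on V
    return fit + graph + sparse

if __name__ == "__main__":
    residuals = np.array([0.5, 3.0])
    print("huber  :", huber(residuals))
    print("squared:", squared(residuals))
```

For a residual of 3.0 with `c = 1.0`, the Huber loss grows linearly while the square loss grows quadratically, which is why a single large outlier has less influence on the fit under the Huber loss.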
