Sparse regularized low-rank tensor regression with applications in genomic data analysis

Abstract Many applications in biomedical informatics involve data in tensor form. Traditional regression methods, which take vectors as covariates, have difficulty handling tensors because of their ultrahigh dimensionality and complex structure. In this paper, we introduce a novel sparse regularized Tucker tensor regression model that exploits the structure of tensor covariates and performs feature selection on tensor data. Using a Tucker decomposition of the regression coefficient tensor, we reduce the ultrahigh dimensionality to a manageable level. To make the model identifiable, we impose an orthonormality constraint on the factor matrices. Unlike previous tensor regression models, which impose a sparsity penalty on the factor matrices of the coefficient tensor, our model imposes the sparsity penalty directly on the coefficient tensor to select the relevant features in the tensor data. We design an efficient optimization algorithm based on the alternating direction method of multipliers (ADMM) to solve the proposed model. We evaluate the model on both synthetic and real genomic data. Experimental results on synthetic data demonstrate that our model identifies the true related signals more accurately than other state-of-the-art regression models. The analysis of melanoma genomic data demonstrates that our model achieves better prediction performance and identifies markers with important implications. Our model and the associated studies can provide useful insights into disease or pathogenesis mechanisms and will benefit further studies in variable selection.
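
The ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the tensor sizes, the Tucker ranks, the threshold value, and all function names (mode_product, tucker_to_tensor, orthonormal_factor, soft_threshold) are assumptions chosen for illustration. It builds a coefficient tensor from a small core and orthonormal factor matrices, forms a linear prediction as the inner product of the covariate tensor with the coefficient tensor, and applies elementwise soft-thresholding, the standard proximal step for an l1 penalty that ADMM-type solvers commonly use.

```python
import numpy as np

def mode_product(tensor, matrix, mode):
    """Mode-n product: multiply `tensor` along axis `mode` by `matrix`
    (matrix has shape new_dim x old_dim)."""
    moved = np.moveaxis(tensor, mode, -1)   # bring the target axis last
    result = moved @ matrix.T               # (..., old_dim) @ (old_dim, new_dim)
    return np.moveaxis(result, -1, mode)    # put the axis back in place

def tucker_to_tensor(core, factors):
    """Rebuild the full tensor from a Tucker core and one factor per mode."""
    out = core
    for mode, factor in enumerate(factors):
        out = mode_product(out, factor, mode)
    return out

def orthonormal_factor(rng, dim, rank):
    """Random dim x rank matrix with orthonormal columns (via reduced QR)."""
    q, _ = np.linalg.qr(rng.standard_normal((dim, rank)))
    return q

def soft_threshold(x, tau):
    """Elementwise proximal operator of tau * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

# Hypothetical sizes: a 6 x 5 x 4 coefficient tensor with Tucker ranks (2, 2, 2).
rng = np.random.default_rng(0)
dims, ranks = (6, 5, 4), (2, 2, 2)
core = rng.standard_normal(ranks)
factors = [orthonormal_factor(rng, d, r) for d, r in zip(dims, ranks)]

B = tucker_to_tensor(core, factors)         # low-rank coefficient tensor
X = rng.standard_normal(dims)               # one tensor covariate
y_hat = float(np.sum(X * B))                # linear predictor <X, B>

# Sparsity acts on the coefficient tensor itself, not on the factors.
B_sparse = soft_threshold(B, 0.1)

n_full = int(np.prod(dims))
n_tucker = core.size + sum(f.size for f in factors)
print(f"free parameters: full={n_full}, Tucker={n_tucker}")
```

With these assumed sizes, the Tucker form stores 38 numbers instead of 120, which is the dimensionality reduction the abstract refers to. The orthonormal columns remove the scaling ambiguity between the core and the factors, one source of non-identifiability in an unconstrained Tucker model.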
