RMoR-Aion: Robust Multioutput Regression by Simultaneously Alleviating Input and Output Noises

Multioutput regression, which predicts multiple continuous output variables simultaneously with a single model, has drawn increasing attention in the machine learning community because it can capture the correlations among the output variables. Output space embedding, built on a low-rank assumption, is now the mainstream approach to multioutput regression because it reduces the number of parameters while retaining strong predictive performance. Existing low-rank methods, however, are sensitive to noise in both the inputs and the outputs, which we refer to as the noise problem. In this article, we develop a novel multioutput regression method, robust multioutput regression by alleviating input and output noises (RMoR-Aion), which handles the noise in both the input and the output by means of auxiliary matrices. Furthermore, we propose a manifold constraint on the predicted outputs that uses correlation information among the output variables to further reduce the adverse effects of the noise. Our empirical studies demonstrate the effectiveness of RMoR-Aion compared with state-of-the-art baseline methods and show that RMoR-Aion is more stable in settings with artificial noise.
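To make the low-rank assumption concrete, the following is a minimal sketch of a generic reduced-rank ridge regression. It is not the RMoR-Aion algorithm: the noise-handling auxiliary matrices and the manifold constraint are not modeled here. The function name `reduced_rank_ridge`, the regularization weight `lam`, and the synthetic data are assumptions made for illustration only. The sketch shows how factorizing a d x m weight matrix into a d x r and an r x m factor reduces the parameter count from d*m to r*(d+m).

```python
import numpy as np


def reduced_rank_ridge(X, Y, rank, lam=1e-2):
    """Illustrative reduced-rank ridge regression (hypothetical helper).

    Returns factors A (d x rank) and B (rank x m) so that W = A @ B.
    """
    n, d = X.shape
    m = Y.shape[1]
    # Ridge regression equals least squares on augmented data.
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(d)])
    Y_aug = np.vstack([Y, np.zeros((d, m))])
    # Full-rank ridge solution, shape (d, m).
    W_full = np.linalg.solve(X_aug.T @ X_aug, X_aug.T @ Y_aug)
    # Top right singular vectors of the fitted values span the output subspace.
    _, _, Vt = np.linalg.svd(X_aug @ W_full, full_matrices=False)
    V = Vt[:rank].T          # (m, rank)
    A = W_full @ V           # (d, rank): inputs -> latent output embedding
    B = V.T                  # (rank, m): latent embedding -> outputs
    return A, B


# Synthetic data with a rank-3 ground-truth weight matrix (assumed setup).
rng = np.random.default_rng(0)
n, d, m, r = 200, 30, 20, 3
X = rng.normal(size=(n, d))
W_true = rng.normal(size=(d, r)) @ rng.normal(size=(r, m))
Y = X @ W_true + 0.1 * rng.normal(size=(n, m))

A, B = reduced_rank_ridge(X, Y, rank=r)
W_hat = A @ B

full_params = d * m                # parameters of an unconstrained W
low_rank_params = r * (d + m)      # parameters of the two factors
rel_err = np.linalg.norm(W_hat - W_true) / np.linalg.norm(W_true)
print(full_params, low_rank_params)
```

In this toy setting the factorized model stores a quarter of the parameters of the unconstrained one. A robust variant in the spirit of the abstract would additionally introduce terms that absorb input and output noise, which this sketch deliberately omits.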
