Rotation Forest for multi-target regression

The simultaneous prediction of multiple numeric outputs is called multi-target regression (MTR), and it has gained attention over the last decades. The task is a challenging research topic in supervised learning because it poses difficulties beyond those of traditional single-target regression (STR), and many real-world problems require predicting multiple targets at once. One of the most successful approaches to MTR, although not the only one, consists of transforming the problem into several STR problems, whose outputs are then combined to build the MTR output. In this paper, the Rotation Forest ensemble method, previously proposed for single-label classification and single-target regression, is adapted to MTR tasks and tested with several regressors and data sets. Our proposal rotates the input space in an efficient and novel fashion, avoiding the extra rotations that decomposing the MTR problem would otherwise force. Four approaches to MTR are used: single-target (ST), stacked single-target (SST), Ensembles of Regressor Chains (ERC), and Multi-target Regression via Quantization (MRQ). To assess the benefits of the proposal, thorough experiments on 28 MTR data sets were carried out and analyzed with statistical tests. They show that Rotation Forest, adapted by means of these approaches, outperforms other popular ensembles, such as Bagging and Random Forest.
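The problem-transformation idea mentioned above (one STR problem per target, with the individual outputs combined into the MTR output) can be illustrated with a minimal sketch of the single-target (ST) scheme. This is not the paper's implementation: the base learner (a 1-nearest-neighbour regressor), the function names, and the toy data are all assumptions chosen to keep the example dependency-free, and no rotation of the input space is performed here.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Regressor = Callable[[Vector], float]


def fit_nearest_neighbour(X: List[Vector], y: List[float]) -> Regressor:
    """Hypothetical single-target base learner: 1-nearest-neighbour regression."""
    def predict(x: Vector) -> float:
        # Squared Euclidean distance from x to every training instance.
        distances = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in X]
        # Return the target value of the closest training instance.
        return y[distances.index(min(distances))]
    return predict


def fit_single_target(X: List[Vector], Y: List[Vector]) -> List[Regressor]:
    """ST decomposition: one independent single-target model per output column."""
    n_targets = len(Y[0])
    return [fit_nearest_neighbour(X, [row[t] for row in Y]) for t in range(n_targets)]


def predict_multi_target(models: List[Regressor], x: Vector) -> List[float]:
    """Combine the single-target predictions into one multi-target output."""
    return [model(x) for model in models]


# Toy data (made up for illustration): 2 input features, 2 targets.
X_train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
Y_train = [[0.0, 1.0], [1.0, 2.0], [1.0, 0.0], [2.0, 1.0]]

models = fit_single_target(X_train, Y_train)
prediction = predict_multi_target(models, [0.9, 0.1])
print(prediction)
```

In this sketch every target gets its own model trained on the same inputs, so any input transformation applied per model would be repeated once per target. That repetition is the kind of overhead the abstract says the proposal avoids, although how the paper achieves this is not described here.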
