Semi-supervised trees for multi-target regression

The predictive performance of traditional supervised methods depends heavily on the amount of labeled data. However, obtaining labels is difficult in many real-life tasks, and only a small amount of labeled data is typically available for model learning. Semi-supervised learning has emerged as an answer to this problem: semi-supervised methods use unlabeled data in addition to labeled data to improve on the performance of supervised methods.

Labeled data is even harder to obtain for data mining problems with structured outputs, since several labels need to be determined for each example. Multi-target regression (MTR) is one type of structured output prediction problem, where multiple continuous variables must be predicted simultaneously. Despite the apparent need for semi-supervised methods that can deal with MTR, only a few such methods are available, and even those are difficult to use in practice and/or their advantages over supervised methods for MTR are not clear.

This paper presents an extension of predictive clustering trees for MTR, and ensembles thereof, towards semi-supervised learning. The proposed method preserves the appealing characteristics of decision trees while enabling the use of unlabeled examples. In particular, the proposed semi-supervised trees for MTR are interpretable, easy to understand, fast to learn, and can handle both numeric and nominal descriptive features. We perform an extensive empirical evaluation in both an inductive and a transductive semi-supervised setting. The results show that the proposed method improves the performance of supervised predictive clustering trees and enhances their interpretability (due to reduced tree size), whereas, in the ensemble learning scenario, it outperforms its supervised counterpart in the transductive setting. The proposed methods have a mechanism for controlling the influence of unlabeled examples, which makes them highly useful in practice: this mechanism can protect them against performing worse than their supervised counterparts, an inherent risk of semi-supervised learning. The proposed methods also outperform two existing semi-supervised methods for MTR.
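The abstract does not spell out how the influence of unlabeled examples is controlled. As a rough illustration only, one way such a control could work in a tree learner is to score each candidate node with an impurity that mixes the variance of the targets (computed on labeled examples) with the variance of the descriptive features (computed on all examples), using a weight between 0 and 1. The sketch below assumes numeric features only; the names `ssl_impurity`, `normalized_variance`, and `w` are hypothetical and are not taken from the paper.

```python
import numpy as np


def normalized_variance(values, global_var):
    """Mean per-column variance of `values`, each column divided by a
    reference variance (assumed non-zero) so columns are comparable."""
    if values.shape[0] < 2:
        return 0.0  # fewer than two rows: treat the node as pure
    return float(np.mean(np.var(values, axis=0) / global_var))


def ssl_impurity(X, Y, labeled, w, x_var, y_var):
    """Hypothetical semi-supervised impurity of one tree node.

    X       : (n, d) numeric descriptive features of the examples in the node
    Y       : (n, t) targets; rows of unlabeled examples are ignored
    labeled : (n,) boolean mask, True where the targets are known
    w       : weight in [0, 1]; w = 1 ignores unlabeled examples entirely,
              w = 0 scores the node on descriptive features only
    x_var, y_var : per-column reference variances from the full training set
    """
    target_term = normalized_variance(Y[labeled], y_var)
    feature_term = normalized_variance(X, x_var)
    return w * target_term + (1.0 - w) * feature_term
```

Under this assumption, setting `w = 1` recovers a purely supervised score, so choosing `w` on held-out labeled data would give a fallback to supervised behavior when unlabeled examples do not help.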
