Enhancing Graph-Based Semisupervised Learning via Knowledge-Aware Data Embedding

Semisupervised learning (SSL) is a family of classification methods designed to reduce the amount of labeled information required in the training phase. Graph-based methods are among the most popular semisupervised strategies: a nearest-neighbor graph is built so that the manifold structure of the data is captured, and the labeled information is propagated to target samples along that structure. Research in graph-based SSL has mainly focused on two aspects: 1) the construction of the $k$-nearest-neighbor graph and/or 2) the propagation algorithm providing the classification. Unlike the previous literature, in this article we focus on the data representation, with the aim of incorporating semisupervision earlier in the process. To this end, we propose an algorithm that learns a new knowledge-aware data embedding via an ensemble of semisupervised autoencoders to enhance graph-based semisupervised classification. Experiments carried out on different classification tasks demonstrate the benefit of our approach.
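The graph-based SSL pipeline described above (build a $k$-nearest-neighbor graph over all samples, then propagate the few known labels along the manifold) can be sketched with scikit-learn's `LabelSpreading`, a standard propagation baseline. This is only an illustrative sketch of the generic pipeline, not the authors' knowledge-aware embedding method; the 10% labeling rate and the choice of $k=7$ are arbitrary assumptions for the example.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.semi_supervised import LabelSpreading

# Load a small benchmark and hide most labels; -1 marks unlabeled samples.
X, y = load_digits(return_X_y=True)
rng = np.random.RandomState(0)
labels = np.copy(y)
unlabeled = rng.rand(len(y)) > 0.1  # keep roughly 10% of the labels
labels[unlabeled] = -1

# Propagate labels over a k-nearest-neighbor graph of the raw features.
model = LabelSpreading(kernel="knn", n_neighbors=7)
model.fit(X, labels)

# Transductive accuracy: how well the hidden labels were recovered.
acc = (model.transduction_[unlabeled] == y[unlabeled]).mean()
print(f"transductive accuracy on unlabeled samples: {acc:.3f}")
```

The article's contribution replaces the raw features `X` with an embedding learned by an ensemble of semisupervised autoencoders, so that the kNN graph is built in a representation already shaped by the available supervision.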
