Statistical-Mechanical Analysis of Semi-Supervised Learning and Its Optimal Scheduling

Semi-supervised learning is a paradigm that uses a large amount of unlabeled data together with a small amount of labeled data. We analyze the dynamical behavior of semi-supervised learning within the framework of on-line learning, using statistical-mechanical methods. In each update, a student uses several correlated input vectors but is given the desired output for only one of them. For this model, we derive deterministic simultaneous differential equations that describe the dynamics of the order parameters, using the self-averaging property in the thermodynamic limit. We treat three well-known learning rules: the Hebbian, Perceptron, and AdaTron rules. We show that using unlabeled data is effective in the early stages of learning for all three rules, and that the three rules exhibit qualitatively different dynamical behaviors. Furthermore, we propose a new algorithm that improves the generalization...
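For orientation, the three learning rules named above can be sketched in their standard textbook on-line form for a teacher-student perceptron. This is a minimal illustration under stated assumptions, not the model analyzed here: it covers only the ordinary labeled update and does not implement the use of correlated unlabeled inputs. The function names, the input scaling (components with variance 1/N), the unit learning rate, and the system size are all hypothetical choices made for the example. Here u denotes the student's local field and v the teacher's local field.

```python
import math
import random


def sign(z):
    """Sign function with sign(0) taken as +1."""
    return 1.0 if z >= 0 else -1.0


# Standard textbook update amplitudes f(u, v); the student is updated as
# J <- J + f(u, v) * x, where u = J.x and v = B.x.
def hebbian(u, v):
    # Always move toward the teacher's label.
    return sign(v)


def perceptron(u, v):
    # Move toward the teacher's label only when the student is wrong.
    return sign(v) if u * v < 0 else 0.0


def adatron(u, v):
    # When wrong, correct in proportion to the student's own local field.
    return -u if u * v < 0 else 0.0


def generalization_error(r):
    """Error of a perceptron with direction cosine r to the teacher,
    for isotropic inputs: arccos(r) / pi."""
    r = max(-1.0, min(1.0, r))
    return math.acos(r) / math.pi


def train(rule, n=100, steps=1000, seed=0):
    """Run on-line learning (assumed setup) and return the final error."""
    rng = random.Random(seed)
    teacher = [rng.gauss(0.0, 1.0) for _ in range(n)]
    student = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = 1.0 / math.sqrt(n)  # input components have variance 1/n
    for _ in range(steps):
        x = [rng.gauss(0.0, 1.0) * scale for _ in range(n)]
        u = sum(j * xi for j, xi in zip(student, x))
        v = sum(b * xi for b, xi in zip(teacher, x))
        f = rule(u, v)
        if f != 0.0:
            student = [j + f * xi for j, xi in zip(student, x)]
    dot = sum(j * b for j, b in zip(student, teacher))
    norm = math.sqrt(sum(j * j for j in student) * sum(b * b for b in teacher))
    return generalization_error(dot / norm)


if __name__ == "__main__":
    for rule in (hebbian, perceptron, adatron):
        print(rule.__name__, train(rule))
```

Running the script prints the final generalization error for each rule after steps/n units of normalized time. The three amplitudes differ only in when they fire and how strongly, which is one simple way to see why their learning dynamics can differ qualitatively.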