Distributionally Robust Semi-supervised Learning

We propose a novel method for semi-supervised learning based on data-driven distributionally robust optimization (DRO) using optimal transport metrics. The method improves generalization error by using the unlabeled data to restrict the support of the worst-case distribution in the DRO formulation. To make the formulation practical, we propose a stochastic gradient descent algorithm that makes the training procedure easy to implement. We demonstrate the improvement in generalization error in semi-supervised extensions of regularized logistic regression and square-root LASSO. Finally, we discuss the large-sample behavior of the optimal uncertainty region in the DRO formulation; this discussion exposes important aspects such as the role of dimension reduction in semi-supervised learning.
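
The idea of restricting the worst-case distribution to observed points can be illustrated with a small sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state. The candidate support is taken to be the labeled points plus every unlabeled point paired with both labels. The transport cost is taken to be squared Euclidean distance between features, and infinite when labels differ. The loss is the logistic loss. The names `dual_robust_loss`, `lam`, and `delta` are hypothetical. The sketch evaluates the standard dual form of optimal-transport DRO, `lam * delta + mean_i max_j [loss(z_j) - lam * c(z_j, z_i)]`, which upper-bounds the worst-case loss for any `lam >= 0`.

```python
import numpy as np

def logistic_loss(beta, X, y):
    # log(1 + exp(-y * x^T beta)), computed in a numerically stable way
    return np.logaddexp(0.0, -y * (X @ beta))

def dual_robust_loss(beta, lam, X_lab, y_lab, X_unl, delta):
    """Dual bound: lam*delta + mean_i max_j [loss(z_j) - lam*c(z_j, z_i)]."""
    m = len(X_unl)
    # Assumed candidate support: labeled points with their labels, plus
    # each unlabeled point paired with both labels -1 and +1.
    X_cand = np.vstack([X_lab, X_unl, X_unl])
    y_cand = np.concatenate([y_lab, -np.ones(m), np.ones(m)])
    cand_loss = logistic_loss(beta, X_cand, y_cand)          # shape (N,)

    # Assumed cost: squared Euclidean distance between features,
    # infinite when the labels differ (mass cannot move across labels).
    diff = X_cand[None, :, :] - X_lab[:, None, :]            # shape (n, N, d)
    dist = np.sum(diff ** 2, axis=2)                         # shape (n, N)
    same_label = y_cand[None, :] == y_lab[:, None]
    inner = np.where(same_label, cand_loss[None, :] - lam * dist, -np.inf)

    # Each labeled point is its own candidate at zero cost, so every row
    # of `inner` has at least one finite entry.
    return lam * delta + inner.max(axis=1).mean()

# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
X_lab = rng.normal(size=(20, 3))
y_lab = np.where(rng.random(20) < 0.5, -1.0, 1.0)
X_unl = rng.normal(size=(50, 3))
beta = np.array([1.0, -0.5, 0.25])

empirical = logistic_loss(beta, X_lab, y_lab).mean()
# Crude grid search over the multiplier lam; this stands in for the
# stochastic gradient descent procedure mentioned in the abstract.
lams = np.logspace(-2, 3, 30)
robust = min(
    dual_robust_loss(beta, lam, X_lab, y_lab, X_unl, delta=0.1) for lam in lams
)
print(f"empirical loss: {empirical:.4f}, robust loss (grid over lam): {robust:.4f}")
```

Because each labeled point is always a zero-cost candidate for itself, the robust value is never below the empirical loss. In a full training loop one would minimize this quantity jointly over `beta` and `lam`, which is where a stochastic gradient method would come in.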

Yang Kang | Jose Blanchet
