Universum Prescription: Regularization Using Unlabeled Data