Regularising Non-linear Models Using Feature Side-information
[1] Inderjit S. Dhillon,et al. Matrix Completion with Noisy Side Information , 2015, NIPS.
[2] Kai-Yang Chiang,et al. Robust Principal Component Analysis with Side Information , 2016 .
[3] George A. Miller,et al. WordNet: A Lexical Database for English , 1995, HLT.
[4] Pradeep Ravikumar,et al. Collaborative Filtering with Graph Information: Consistency and Scalable Methods , 2015, NIPS.
[5] Adrian Corduneanu,et al. On Information Regularization , 2002, UAI.
[6] Yann LeCun,et al. Tangent Prop - A Formalism for Specifying Selected Invariances in an Adaptive Network , 1991, NIPS.
[7] Bernhard Schölkopf,et al. Training Invariant Support Vector Machines , 2002, Machine Learning.
[8] Pascal Vincent,et al. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion , 2010, J. Mach. Learn. Res..
[9] Heekuck Oh,et al. Neural Networks for Pattern Recognition , 1993, Adv. Comput..
[10] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[11] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[12] Yoshua Bengio,et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.
[13] Yang Song,et al. Improving the Robustness of Deep Neural Networks via Stability Training , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Lorenzo Rosasco,et al. Nonparametric sparsity and regularization , 2012, J. Mach. Learn. Res..
[15] Gerard Salton,et al. Term-Weighting Approaches in Automatic Text Retrieval , 1988, Inf. Process. Manag..
[16] Michael I. Jordan,et al. Distance Metric Learning with Application to Clustering with Side-Information , 2002, NIPS.
[17] Pascal Vincent,et al. Contractive Auto-Encoders: Explicit Invariance During Feature Extraction , 2011, ICML.
[18] Pascal Vincent,et al. Higher Order Contractive Auto-Encoder , 2011, ECML/PKDD.
[19] Rauf Izmailov,et al. Learning using privileged information: similarity control and knowledge transfer , 2015, J. Mach. Learn. Res..
[20] Jian Huang,et al. The Sparse Laplacian Shrinkage Estimator for High-Dimensional Regression. , 2011, Annals of statistics.
[21] Gerard Salton,et al. Term-Weighting Approaches in Automatic Text Retrieval , 1988 .
[22] Matt J. Kusner,et al. From Word Embeddings To Document Distances , 2015, ICML.
[23] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[24] Chris Bishop,et al. Exact Calculation of the Hessian Matrix for the Multilayer Perceptron , 1992, Neural Computation.
[25] M. Yuan,et al. Model selection and estimation in regression with grouped variables , 2006 .
[26] Zhongfei Zhang,et al. Manifold Regularized Discriminative Neural Networks , 2015, ArXiv.
[27] R. Tibshirani,et al. Sparsity and smoothness via the fused lasso , 2005 .
[28] Naftali Tishby,et al. Incorporating Prior Knowledge on Features into Learning , 2007, AISTATS.
[29] Naftali Tishby,et al. Learning to Select Features using their Properties , 2008 .