Optimal Sampling of Parametric Families: Implications for Machine Learning
[1] L. Jones. Constructive approximations for neural networks by sigmoidal functions, 1990, Proc. IEEE.
[2] Rich Caruana, et al. Multitask Learning, 1997, Machine-mediated learning.
[3] Wayne A. Fuller, et al. Predictors for the first-order autoregressive process, 1980.
[4] Neri Merhav, et al. A strong version of the redundancy-capacity theorem of universal coding, 1995, IEEE Trans. Inf. Theory.
[5] Jorma Rissanen, et al. Fisher information and stochastic complexity, 1996, IEEE Trans. Inf. Theory.
[6] Benjamin Recht, et al. Do CIFAR-10 Classifiers Generalize to CIFAR-10?, 2018, arXiv.
[7] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[8] W. Fuller, et al. Properties of Predictors for Autoregressive Time Series, 1981.
[9] Jorma Rissanen, et al. Universal coding, information, prediction, and estimation, 1984, IEEE Trans. Inf. Theory.
[10] Vladimir Vapnik, et al. Statistical learning theory, 1998.
[11] Abhisek Kundu, et al. Recovering PCA and Sparse PCA via Hybrid-(l1, l2) Sparse Sampling of Data Elements, 2017, J. Mach. Learn. Res..
[12] Vijay Balasubramanian, et al. A Geometric Formulation of Occam's Razor for Inference of Parametric Distributions, 1996, adap-org/9601001.
[13] Mehryar Mohri, et al. Learning Theory and Algorithms for Forecasting Non-stationary Time Series, 2015, NIPS.
[14] Neri Merhav, et al. Universal Prediction, 1998, IEEE Trans. Inf. Theory.
[15] A. Barron. The strong ergodic theorem for densities: generalized Shannon-McMillan-Breiman theorem, 1985.
[16] Cosma Rohilla Shalizi, et al. Nonparametric Risk Bounds for Time-Series Forecasting, 2012, J. Mach. Learn. Res..
[17] L. M. M.-T. Theory of Probability, 1929, Nature.