Bounds on the sample complexity of Bayesian learning using information theory and the VC dimension
[1] Vladimir Vapnik, Alexey Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities, 1971.
[2] Norbert Sauer. On the density of families of sets, 1972, J. Comb. Theory, Ser. A.
[3] Richard O. Duda, Peter E. Hart. Pattern Classification and Scene Analysis, 1973, A Wiley-Interscience publication.
[4] W. N. Wapnik, A. J. Tscherwonenkis. Theorie der Zeichenerkennung, 1979.
[5] P. Assouad. Densité et dimension, 1983.
[6] Leslie G. Valiant. A theory of the learnable, 1984, STOC '84.
[7] R. Dudley. A course on empirical processes, 1984.
[8] P. Massart. Rates of convergence in the central limit theorem for empirical processes, 1986.
[9] Lawrence D. Jackel, et al. Large automatic learning, rule extraction, and generalization, 1987, Complex Syst.
[10] M. Talagrand. Donsker classes of sets, 1988.
[11] Alfredo De Santis, et al. Learning probabilistic prediction functions, 1988, 29th Annual Symposium on Foundations of Computer Science (FOCS '88).
[12] David Haussler, et al. Predicting {0,1}-functions on randomly drawn points, 1988, COLT '88.
[13] David Haussler, et al. What size net gives valid generalization?, 1989, Neural Computation.
[14] Naftali Tishby, et al. Consistent inference of probabilities in layered networks: predictions and generalizations, 1989, International Joint Conference on Neural Networks (IJCNN '89).
[15] Michael J. Pazzani, et al. Average case analysis of empirical and explanation-based learning algorithms, 1989.
[16] David Haussler, et al. Learnability and the Vapnik-Chervonenkis dimension, 1989, JACM.
[17] Andrew R. Barron, et al. Information-theoretic asymptotics of Bayes methods, 1990, IEEE Trans. Inf. Theory.
[18] Vladimir Vovk. Aggregating strategies, 1990, COLT '90.
[19] N. Littlestone. Mistake bounds and logarithmic linear-threshold learning algorithms, 1990.
[20] H. Sompolinsky, et al. Learning from examples in large neural networks, 1990, Physical Review Letters.
[21] Wray L. Buntine. A theory of learning classification rules, 1990.
[22] Thomas M. Cover, Joy A. Thomas. Elements of Information Theory, 1991.
[23] Wray L. Buntine, et al. Bayesian back-propagation, 1991, Complex Syst.
[24] Philip M. Long, et al. On-line learning of linear functions, 1991, STOC '91.
[25] David Haussler, et al. Calculation of the learning curve of Bayes optimal classification algorithm for learning a perceptron with noise, 1991, COLT '91.
[26] Balas K. Natarajan. Probably approximate learning over classes of distributions, 1992, SIAM J. Comput.
[27] John Shawe-Taylor, et al. Bounding sample size with the Vapnik-Chervonenkis dimension, 1993, Discrete Applied Mathematics.
[28] Michael J. Pazzani, et al. A framework for average case analysis of conjunctive learning algorithms, 1992, Machine Learning.