Minimum complexity density estimation

The authors introduce an index of resolvability that is proved to bound the rate of convergence of minimum complexity density estimators as well as the information-theoretic redundancy of the corresponding total description length. The results on the index of resolvability demonstrate the statistical effectiveness of the minimum description-length principle as a method of inference. The minimum complexity estimator converges to the true density nearly as fast as an estimator based on prior knowledge of the true subclass of densities. Interpretations and basic properties of minimum complexity estimators are discussed. Some regression and classification problems that can be examined from the minimum description-length framework are considered.
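
The abstract gives no formulas, so the following is only a sketch of how these two quantities are commonly written in the minimum description-length literature. The notation is an assumption and is not taken from the text above: $\Gamma_n$ is a countable list of candidate densities, $L_n(q)$ is a codelength for $q$ satisfying the Kraft inequality, $X_1, \dots, X_n$ is the sample, $p$ is the true density, and $D(p \| q)$ is the relative entropy.

```latex
% Minimum complexity estimator: the candidate density that minimizes the
% total two-stage description length (code for q, then code for the data given q).
\hat{p}_n = \arg\min_{q \in \Gamma_n}
  \Big\{ L_n(q) + \sum_{i=1}^{n} \log \frac{1}{q(X_i)} \Big\}

% Index of resolvability: the best available trade-off between complexity
% per observation and approximation error against the true density p.
R_n(p) = \min_{q \in \Gamma_n}
  \Big\{ \frac{L_n(q)}{n} + D(p \,\|\, q) \Big\}
```

Read this way, the index is small when some candidate with a short description also approximates $p$ well, which is the sense in which it can bound both the convergence rate and the redundancy mentioned in the abstract.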
