Data-dependent weak universal redundancy

We are motivated by applications that require rich model classes to represent the underlying data, such as the set of all discrete distributions over large, countably infinite supports. Such rich classes, however, may be too complex to admit estimators that converge to the truth at rates that can be uniformly bounded over the entire class as the sample size increases (uniform consistency). These classes may still allow for estimators with pointwise guarantees, whose performance can be bounded in a model-dependent way. The pointwise angle, however, has an intrinsic drawback: estimator performance depends on the very unknown model being estimated, and is therefore itself unknown. Consequently, even when an estimator is consistent, how well it is doing may never be clear, no matter how large the sample. Departing from the uniform/pointwise dichotomy, we explore a new analysis framework: we characterize rich model classes that may admit only pointwise guarantees, yet for which all information about the unknown model needed to gauge estimator accuracy can be inferred from the sample at hand. To bring focus, we analyze the universal compression problem in this data-derived, pointwise consistency framework.
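To make the uniform/pointwise dichotomy concrete in the compression setting, the following sketch recalls the standard redundancy definitions; the notation ($\mathcal{P}$, $q_n$, $R_n$) is illustrative and not fixed by this section. For a class $\mathcal{P}$ of i.i.d. sources and a coding distribution $q_n$ on length-$n$ strings, the per-symbol redundancy of $q_n$ against $p \in \mathcal{P}$ is the normalized Kullback-Leibler divergence
\[
  R_n(q, p) \;=\; \frac{1}{n}\, D\!\left(p^n \,\middle\|\, q_n\right)
  \;=\; \frac{1}{n}\, \mathbb{E}_{p^n}\!\left[\log \frac{p^n(X^n)}{q_n(X^n)}\right].
\]
Uniform (strongly universal) schemes require the worst case over the class to vanish,
\[
  \sup_{p \in \mathcal{P}} R_n(q, p) \;\to\; 0,
\]
while pointwise (weakly universal) schemes require only
\[
  R_n(q, p) \;\to\; 0 \quad \text{for every fixed } p \in \mathcal{P},
\]
so the rate of convergence may depend on the unknown $p$. In this language, the framework above asks when that $p$-dependent rate can itself be estimated from the sample.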