Parametric estimation. Finite sample theory

The paper reconsiders the famous Le Cam LAN (local asymptotic normality) theory. The main features of the approach that distinguish it from the classical one are: (1) the study is non-asymptotic, that is, the sample size is fixed and does not tend to infinity; (2) the parametric assumption may be misspecified, so the underlying data distribution can lie beyond the given parametric family. The main results include large deviation bounds for the (quasi) maximum likelihood estimator and a local quadratic majorization of the log-likelihood process. The latter yields a number of important corollaries for statistical inference: concentration, confidence, and risk bounds, an expansion of the maximum likelihood estimate, etc. All these corollaries are stated in a non-classical way that admits model misspecification and finite samples. However, the classical asymptotic results, including the efficiency bounds, can be easily derived as corollaries of the non-asymptotic statements. The general results are illustrated for the i.i.d. setup as well as for generalized linear and median estimation. The results apply for any dimension of the parameter space and provide a quantitative lower bound on the sample size that yields root-n accuracy. We also discuss procedures that allow recovering the structure when its effective dimension is unknown.
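As an illustrative sketch (not taken from the paper), the quasi-maximum-likelihood setting under misspecification can be demonstrated numerically: a Gaussian model is fitted by maximum likelihood to data that are not Gaussian, and the estimator targets the pseudo-true parameter, i.e. the member of the parametric family closest to the true distribution in Kullback-Leibler divergence. The distribution choices and sample size below are arbitrary assumptions for the demonstration.

```python
import numpy as np

def gaussian_quasi_mle(x):
    """Quasi-MLE for the Gaussian model N(theta, sigma^2): maximizes the
    Gaussian log-likelihood even when the data are not Gaussian.
    For this family the maximizer has a closed form: the sample mean
    and the (biased) sample variance."""
    return x.mean(), x.var()

rng = np.random.default_rng(0)

# True data distribution: a shifted exponential, which lies OUTSIDE
# the Gaussian family -- the model is deliberately misspecified.
n = 100_000
x = rng.exponential(scale=2.0, size=n) + 1.0  # true mean 3.0, true variance 4.0

theta_hat, sigma2_hat = gaussian_quasi_mle(x)

# The pseudo-true parameter (the KL projection of the truth onto the
# Gaussian family) matches the true mean and variance: (3.0, 4.0).
# The quasi-MLE concentrates around it at the root-n rate.
print(theta_hat, sigma2_hat)  # approximately (3.0, 4.0) for large n
```

Rerunning with smaller `n` shows the estimation error shrinking roughly like `1/sqrt(n)`, in line with the root-n accuracy discussed above.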
