Sharp oracle inequalities and slope heuristic for specification probabilities estimation in discrete random fields

We study the problem of estimating the one-point specification probabilities in not necessarily finite discrete random fields from partially observed independent samples. Our procedures are based on model selection by minimization of a penalized empirical criterion. The selected estimators satisfy sharp oracle inequalities in $L_{2}$-risk. We also obtain theoretical results on the slope heuristic for this problem, which justify using the slope algorithm to calibrate the leading constant in the penalty. The practical performance of our methods is investigated in two simulation studies. We illustrate the usefulness of our approach by applying the methods to multi-unit neuronal data from a rat hippocampus.
