Model Order Selection Based on Information Theoretic Criteria: Design of the Penalty

Information theoretic criteria (ITC) have been widely adopted in engineering and statistics for selecting, among an ordered set of candidate models, the one that best fits the observed sample data. The selected model minimizes a penalized likelihood metric, where the penalty is determined by the criterion adopted. While rules for choosing a penalty that guarantees a consistent estimate of the model order are known, theoretical tools for designing the penalty with finite samples have never been provided in a general setting. In this paper, we study model order selection for finite samples from a design perspective, focusing on the generalized information criterion (GIC), which embraces the most common ITC. The theory is general, and as case studies we consider: a) the problem of estimating the number of signals embedded in additive white Gaussian noise (AWGN) by using multiple sensors; b) model selection for the general linear model (GLM), which includes, e.g., the problem of estimating the number of sinusoids in AWGN. The analysis reveals a trade-off between the probabilities of overestimating and underestimating the order of the model. We then propose to design the GIC penalty to minimize the underestimation probability while keeping the overestimation probability below a specified level. For the considered problems, this method leads to an analytical derivation of the optimal penalty for a given sample size. A performance comparison between the penalty-optimized GIC and the common AIC and BIC is provided, demonstrating the effectiveness of the proposed design strategy.
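As a rough illustration of the penalized likelihood rule described above, the sketch below assumes the commonly used GIC form, -2 log L_k + nu * phi_k, where phi_k is the number of free parameters of the k-th candidate model and nu is the penalty; nu = 2 gives AIC and nu = log N gives BIC. The function name `gic_select` and the log-likelihood values are hypothetical and are not taken from the paper.

```python
import math

def gic_select(log_likelihoods, num_params, nu):
    """Return the index of the candidate model minimizing the assumed
    GIC metric: -2 * log-likelihood + nu * (number of free parameters)."""
    scores = [-2.0 * ll + nu * phi
              for ll, phi in zip(log_likelihoods, num_params)]
    return min(range(len(scores)), key=scores.__getitem__)

# Hypothetical maximized log-likelihoods for four nested candidate models
# (illustrative numbers only, not results from the paper).
log_likelihoods = [-120.0, -100.0, -98.5, -98.2]
num_params = [1, 2, 3, 4]
n_samples = 100  # assumed sample size N, used only for the BIC penalty

aic_index = gic_select(log_likelihoods, num_params, nu=2.0)
bic_index = gic_select(log_likelihoods, num_params, nu=math.log(n_samples))
print("AIC picks candidate", aic_index, "- BIC picks candidate", bic_index)
```

With these illustrative numbers, the heavier BIC penalty selects a lower order than AIC, which is the overestimation versus underestimation trade-off that a penalty design has to balance.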
