An investigation of several typical model selection criteria for detecting the number of signals

Based on the problem of detecting the number of signals, this paper provides a systematic empirical investigation of the model selection performance of several classical criteria and recently developed methods, including Akaike’s information criterion (AIC), Schwarz’s Bayesian information criterion (BIC), Bozdogan’s consistent AIC (CAIC), the Hannan-Quinn information criterion (HQ), Minka’s (MK) principal component analysis (PCA) criterion, Kritchman & Nadler’s hypothesis tests (KN), Perry & Wolfe’s minimax rank estimation thresholding algorithm (MM), and Bayesian Ying-Yang (BYY) harmony learning, by varying the signal-to-noise ratio (SNR) and the training sample size N. A family of model selection indifference curves is defined by the contour lines of model selection accuracy, so that the joint effect of N and SNR can be examined, rather than merely the effect of one with the other fixed, as is usually done in the literature. The indifference curves visually reveal that the relative advantages of all methods become apparent within a region of moderate N and SNR. The importance of studying this region is further confirmed by an alternative reference criterion that maximizes the testing likelihood. Extensive simulations show that AIC and BYY harmony learning, as well as MK, KN, and MM, are more robust than the other criteria against decreasing N and SNR, and that BYY is superior for small sample sizes.

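As a concrete illustration of the detection problem the abstract describes, the sketch below implements the classical Wax-Kailath eigenvalue-based forms of AIC and MDL/BIC and estimates their accuracy over a small (N, SNR) grid; contour lines of such an accuracy table are the kind of indifference curves the paper studies. This is a minimal, hypothetical setup, not the authors' code: the mixing model, the per-signal SNR definition, and the grid values are illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions, not the paper's implementation):
# detect the number of signals from the eigenvalues of a sample covariance
# matrix using the classical Wax-Kailath AIC and MDL/BIC rules.
import numpy as np

def aic_mdl_order(eigvals, n_samples):
    """Return (k_AIC, k_MDL) given eigenvalues sorted in descending order."""
    p = len(eigvals)
    aic, mdl = [], []
    for k in range(p):
        tail = eigvals[k:]
        # log of the ratio of geometric to arithmetic mean of the p-k smallest eigenvalues
        log_ratio = np.mean(np.log(tail)) - np.log(np.mean(tail))
        free = k * (2 * p - k)  # number of free parameters for order k
        aic.append(-2.0 * n_samples * (p - k) * log_ratio + 2.0 * free)
        mdl.append(-n_samples * (p - k) * log_ratio + 0.5 * free * np.log(n_samples))
    return int(np.argmin(aic)), int(np.argmin(mdl))

def simulate_accuracy(p=10, k_true=3, snr_db=0.0, n_samples=100, trials=200, seed=None):
    """Fraction of trials in which AIC / MDL recover k_true (hypothetical data model)."""
    rng = np.random.default_rng(seed)
    hits = np.zeros(2)
    sigma2 = 1.0                                   # unit noise variance (assumption)
    signal_var = sigma2 * 10.0 ** (snr_db / 10.0)  # per-signal variance from SNR (assumption)
    for _ in range(trials):
        A = rng.standard_normal((p, k_true))                       # random mixing matrix
        s = rng.standard_normal((k_true, n_samples)) * np.sqrt(signal_var)
        x = A @ s + rng.standard_normal((p, n_samples)) * np.sqrt(sigma2)
        eigvals = np.sort(np.linalg.eigvalsh(x @ x.T / n_samples))[::-1]
        k_aic, k_mdl = aic_mdl_order(eigvals, n_samples)
        hits += [k_aic == k_true, k_mdl == k_true]
    return hits / trials

if __name__ == "__main__":
    # Accuracy over a small (N, SNR) grid; its contour lines correspond to the
    # "model selection indifference curves" discussed in the abstract.
    for n in (50, 100, 400):
        for snr in (-5.0, 0.0, 5.0):
            acc_aic, acc_mdl = simulate_accuracy(snr_db=snr, n_samples=n, seed=0)
            print(f"N={n:4d}  SNR={snr:+.0f} dB  AIC acc={acc_aic:.2f}  MDL acc={acc_mdl:.2f}")
```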