Model selection based on Bayesian predictive densities and multiple data records

Bayesian predictive densities are used to derive model selection rules. The resulting rules hold for sets of data records in which each record is composed of an unknown number of deterministic signals common to all the records and stationary white Gaussian noise. To determine the correct model, the set of data records is partitioned into two disjoint subsets. One subset is used to estimate the model parameters and the other to validate the model. Two proposed estimators for linear nested models are examined in detail and some of their properties are derived. Optimal strategies for partitioning the data records into estimation and validation subsets are discussed, and analytical comparisons with Akaike's information criterion (AIC) and the minimum description length (MDL) of Schwarz and Rissanen are carried out. The performance of the estimators and their comparisons with the AIC and MDL selection rules are illustrated by numerical simulations. The results show that the Bayesian selection rules outperform the popular AIC and MDL criteria.
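The estimation/validation split described above can be illustrated with a minimal sketch. It is not one of the paper's two estimators: it assumes a Gaussian plug-in predictive density (least-squares parameter estimates and an ML noise variance substituted into the likelihood) rather than a full Bayesian predictive density, and the function name `select_order_by_validation`, the polynomial design matrix, and all numerical values are hypothetical choices made for illustration.

```python
import numpy as np


def select_order_by_validation(records, H, n_est):
    """Pick the order of a nested linear model by scoring held-out records.

    records : (M, N) array, M records of length N sharing one deterministic signal
    H       : (N, K) design matrix; candidate model k uses the first k columns
    n_est   : number of records assigned to the estimation subset
    """
    est, val = records[:n_est], records[n_est:]  # two disjoint subsets
    est_mean = est.mean(axis=0)                  # signal is common to all records
    scores = []
    for k in range(1, H.shape[1] + 1):
        Hk = H[:, :k]
        # Least-squares amplitudes from the estimation subset only.
        theta, *_ = np.linalg.lstsq(Hk, est_mean, rcond=None)
        # ML noise-variance estimate, also from the estimation subset.
        sigma2 = np.mean((est - Hk @ theta) ** 2)
        # Plug-in Gaussian log-density of the validation subset (an assumption,
        # standing in for the Bayesian predictive density).
        resid_val = val - Hk @ theta
        loglik = (-0.5 * val.size * np.log(2.0 * np.pi * sigma2)
                  - 0.5 * np.sum(resid_val ** 2) / sigma2)
        scores.append(float(loglik))
    return int(np.argmax(scores)) + 1, scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, M, K = 50, 20, 5
    t = np.linspace(-1.0, 1.0, N)
    H = np.vander(t, K, increasing=True)         # nested columns: 1, t, t^2, ...
    signal = H[:, :3] @ np.array([2.0, -3.0, 4.0])
    records = signal + 0.5 * rng.standard_normal((M, N))
    order, scores = select_order_by_validation(records, H, n_est=10)
    print("selected order:", order)
```

Candidate models that omit a true signal component score far worse on the validation subset, while overparameterized models are penalized only through their extra estimation error, so the choice of `n_est` governs how sharply the rule separates the correct order from larger ones.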
