Model selection through a statistical analysis of the minimum of a weighted least squares cost function

Abstract: A model selection strategy is proposed that combines (i) a statistical interpretation of the minimum of a Weighted Least Squares cost function and (ii) the principle of parsimony. First, it is compared via simulation with model selection methods based on information criteria (AIC and MDL type). These simulations show that the cost function approach outperforms the information criteria in selecting the true model, especially when the number of data points is very small compared with the number of parameters to be estimated. Next, the model metaselection proposed by de Luna and Skouras [X. De Luna, K. Skouras, Choosing a model selection strategy, Scand. J. Stat. 30(1) (2003) 113–128] is employed as an objective method for choosing the best model selection method. Applied to one of their examples, it clearly selects the cost function strategy as the best method. Finally, the cost function approach is used on a set of field data to select the relevant parameters of a complex model.
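
The idea of combining the cost function minimum with parsimony can be illustrated with a minimal sketch. It is not the authors' exact procedure, which the abstract does not spell out. It relies on a standard result: if the weights are the inverse noise variances, the noise is Gaussian, and the model is correct and linear in its parameters, the minimum of the WLS cost follows a chi-square distribution with N - n_theta degrees of freedom. The sketch then picks the simplest model whose minimum cost is statistically acceptable. The polynomial model family, the 95% one-sided bound, the Wilson-Hilferty quantile approximation, and all function names are assumptions made for illustration.

```python
import random
from statistics import NormalDist

def chi2_quantile(p, dof):
    """Approximate chi-square quantile (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(p)
    a = 2.0 / (9.0 * dof)
    return dof * (1.0 - a + z * a ** 0.5) ** 3

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def wls_min_cost(x, y, sigma, order):
    """Minimum of the WLS cost for a polynomial model of the given order."""
    n = order + 1
    w = [1.0 / s ** 2 for s in sigma]  # weights = inverse noise variances
    # Weighted normal equations: (X^T W X) theta = X^T W y
    A = [[sum(wi * xi ** (i + j) for wi, xi in zip(w, x)) for j in range(n)]
         for i in range(n)]
    b = [sum(wi * yi * xi ** i for wi, xi, yi in zip(w, x, y)) for i in range(n)]
    theta = solve(A, b)
    model = [sum(t * xi ** k for k, t in enumerate(theta)) for xi in x]
    return sum(wi * (yi - mi) ** 2 for wi, yi, mi in zip(w, y, model))

def select_order(x, y, sigma, max_order, p=0.95):
    """Return the lowest order whose minimum cost passes the chi-square bound."""
    for order in range(max_order + 1):
        dof = len(x) - (order + 1)  # N - n_theta
        if dof <= 0:
            break
        if wls_min_cost(x, y, sigma, order) <= chi2_quantile(p, dof):
            return order  # parsimony: stop at the simplest acceptable model
    return None  # no candidate model is acceptable

# Hypothetical usage: short record of a noisy quadratic with known noise level.
random.seed(0)
xs = [i / 10 for i in range(12)]
sig = [0.1] * len(xs)
ys = [1 + 2 * xi - 3 * xi ** 2 + random.gauss(0.0, 0.1) for xi in xs]
print("selected polynomial order:", select_order(xs, ys, sig, max_order=5))
```

Testing candidates from simplest to most complex is what encodes parsimony here: extra parameters are accepted only when the simpler model's cost is too large to be explained by the known measurement noise.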

[1]  Yves Rolain, et al. Modified AIC rule for model selection in combination with prior estimated noise models, 2002, Autom.

[2]  M. Stone. Cross‐Validatory Choice and Assessment of Statistical Predictions, 1976.

[3]  Malcolm R. Forster, et al. Simplicity, Inference and Modelling: The new science of simplicity, 2002.

[4]  Arnold Zellner, et al. Simplicity, Inference and Modelling, 2002.

[5]  Rik Pintelon, et al. Modified AIC and MDL model selection criteria for short data records, 2004, IEEE Transactions on Instrumentation and Measurement.

[6]  M. Stone. An Asymptotic Equivalence of Choice of Model by Cross‐Validation and Akaike's Criterion, 1977.

[7]  W. Baeyens, et al. Reliability of N flux rates estimated from 15N enrichment and dilution experiments in aquatic systems, 2005.

[8]  W. Baeyens, et al. Contribution of nitrate to the uptake of nitrogen by phytoplankton in an ocean margin environment, 2004, Hydrobiologia.

[9]  I. A. Kieseppä. AIC and Large Samples, 2003, Philosophy of Science.

[10]  W. Baeyens, et al. N uptake conditions during summer in the Subantarctic and Polar Frontal Zones of the Australian sector of the Southern Ocean, 2002.

[11]  H. Akaike. A new look at the statistical model identification, 1974.

[12]  Woojae Kim, et al. Flexibility versus generalizability in model selection, 2003, Psychonomic bulletin & review.

[13]  Xavier de Luna, et al. Choosing a Model Selection Strategy, 2003.

[14]  J. Rissanen, et al. Modeling By Shortest Data Description, 1978, Autom.

[15]  J. Schoukens, et al. Refined parameter and uncertainty estimation when both variables are subject to error. Case study: estimation of Si consumption and regeneration rates in a marine environment, 2005.

[16]  M. J. Box. Improved Parameter Estimation, 1970.

[17]  G. Schwarz. Estimating the Dimension of a Model, 1978.

[18]  Vladimír Rod, et al. Iterative estimation of model parameters when measurements of all variables are subject to error, 1980.