The use of simplified or misspecified models: Linear case

Simplified models have many appealing properties and sometimes give better parameter estimates and model predictions, in the sense of mean squared error, than extended models, especially when the data are not informative. In this paper, we summarize extensive quantitative and qualitative results from the literature on the use of simplified or misspecified models. Based on confidence intervals and hypothesis tests, we develop a practical strategy to help modellers decide whether a simplified model should be used, and we point out the difficulty of making such a decision. We also evaluate several methods of statistical inference for simplified or misspecified models.
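The mean-squared-error trade-off can be illustrated with a minimal sketch. It is not the strategy developed in the paper, only the standard omitted-variable calculation for a two-regressor linear model without intercept. The function name `mse_of_b1`, the regressors `x1` and `x2`, and the values of `b2` and `sigma` are all assumptions chosen for illustration.

```python
import numpy as np

def mse_of_b1(x1, x2, b2, sigma):
    """Exact MSE of the estimator of b1 under the extended and simplified fits.

    Assumed true model (hypothetical): y = b1*x1 + b2*x2 + e, Var(e) = sigma**2.
    The extended fit regresses y on (x1, x2); the simplified fit uses x1 only.
    """
    X = np.column_stack([x1, x2])
    # Extended model: unbiased, so the MSE equals the variance of the b1 estimate.
    mse_extended = sigma**2 * np.linalg.inv(X.T @ X)[0, 0]
    # Simplified model: smaller variance, plus the squared omitted-variable bias.
    a = x1 @ x1
    bias = (x1 @ x2) / a * b2
    mse_simplified = sigma**2 / a + bias**2
    return mse_extended, mse_simplified

# Illustrative, strongly correlated regressors (assumed design, not from the paper).
x1 = np.linspace(0.0, 1.0, 10)
x2 = x1**2
for b2 in (0.5, 10.0):
    ext, simp = mse_of_b1(x1, x2, b2, sigma=1.0)
    print(f"b2={b2}: extended={ext:.3f}, simplified={simp:.3f}")
```

In this setup, with correlated regressors, the simplified estimator of `b1` has the smaller mean squared error when `b2**2` is below the variance of the extended model's estimate of `b2`, which is the situation that arises when the data carry little information about `b2`.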
