Evaluating modified generalized information criterion in presence of multicollinearity

ABSTRACT When a regression model contains many explanatory variables, some of them may be intercorrelated. The resulting multicollinearity degrades the precision and accuracy of the coefficient estimates and makes the search for the best model tedious. In such situations, model selection criteria are applied to select the model that best fits the data. The current study evaluates four unmodified and four modified versions of generalized information criteria: the Akaike Information Criterion, Schwarz's Bayesian Information Criterion, the Hannan-Quinn Information Criterion, and the Akaike Information Criterion corrected for small samples. A simulation study was carried out in SAS to compare the unmodified and modified versions and to determine which of the four modified criteria best identifies the best model when the assumption of no collinearity is violated. For the simulation, samples of size 50 and 100 on three explanatory variables X1, X2, and X3 are drawn from the Normal distribution. Two levels of collinearity between X1 and X2 are examined: ρ = 0.6 and ρ = 0.8. The simulation outcomes are presented in tables along with visual representations. The results show that the modified versions of the generalized information criteria are more sensitive than the unmodified versions in identifying models affected by high multicollinearity.
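The simulation design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' SAS code, and it covers only the four unmodified criteria, because the abstract does not define the modified versions. Several details are assumptions made for illustration: the response model `y = 1 + X1 + X2 + X3 + noise`, the standard Normal marginals, the function names `simulate` and `criteria`, and the common Gaussian least-squares forms of the criteria (stated up to an additive constant, with `k` counting the regression coefficients including the intercept).

```python
import numpy as np


def simulate(n, rho, rng):
    """Draw X1, X2, X3 ~ N(0, 1) with corr(X1, X2) = rho and X3 independent.

    The response coefficients below are hypothetical; the abstract does not
    state the data-generating model for y.
    """
    x1 = rng.standard_normal(n)
    z = rng.standard_normal(n)
    # Mixing x1 with independent noise gives unit variance and correlation rho.
    x2 = rho * x1 + np.sqrt(1.0 - rho**2) * z
    x3 = rng.standard_normal(n)
    X = np.column_stack([x1, x2, x3])
    y = 1.0 + x1 + x2 + x3 + rng.standard_normal(n)
    return X, y


def criteria(X, y):
    """Fit OLS with an intercept and return the four unmodified criteria.

    Assumed forms (Gaussian likelihood, additive constants dropped):
      AIC  = n*ln(RSS/n) + 2k
      BIC  = n*ln(RSS/n) + k*ln(n)
      HQC  = n*ln(RSS/n) + 2k*ln(ln(n))
      AICc = AIC + 2k(k+1)/(n-k-1)
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    rss = float(resid @ resid)
    k = design.shape[1]  # number of regression coefficients, intercept included
    base = n * np.log(rss / n)
    aic = base + 2 * k
    return {
        "AIC": aic,
        "BIC": base + k * np.log(n),
        "HQC": base + 2 * k * np.log(np.log(n)),
        "AICc": aic + 2 * k * (k + 1) / (n - k - 1),
    }


rng = np.random.default_rng(2024)  # arbitrary seed, for reproducibility only
for n in (50, 100):
    for rho in (0.6, 0.8):
        X, y = simulate(n, rho, rng)
        full = criteria(X, y)                # model with X1, X2, X3
        reduced = criteria(X[:, [0, 2]], y)  # model without the collinear X2
        print(n, rho, {name: round(full[name] - reduced[name], 2) for name in full})
```

Each printed line shows, for one sample size and one value of ρ, the difference in each criterion between the full model and the model that omits X2; a negative difference means the criterion prefers the full model. A single draw per setting is shown for brevity, whereas a simulation study would repeat the draw many times and tabulate how often each candidate model is selected.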
