Investigating omitted variable bias in regression parameter estimation: A genetic algorithm approach

Bias in regression estimates caused by omitting a relevant variable that is correlated with the included regressors is a well-known phenomenon. In this study, we apply a genetic algorithm to estimate the missing variable and, using that estimate, demonstrate that significant bias in the regression estimates can be substantially corrected, with relatively high confidence in effective models. We restrict attention to the case of a missing binary indicator variable and to the analytical properties of bias and MSE dominance for the resulting dependent-error generated vector process. These findings are compared with prior results for the independent-error proxy process. Simulations for medium sample sizes show that the method substantially reduces estimation bias and often yields useful estimates of the missing vector. Limited simulations for the continuous-variable case are also reported; they indicate some potential for the method and for future research.
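To make the phenomenon concrete, the following is a minimal simulation sketch of omitted variable bias with a binary indicator. It is not the genetic algorithm procedure studied here; it only illustrates the bias that procedure aims to correct. The data-generating model, the coefficient values, the sample size, and the names `slope` and `simulate` are all assumptions chosen for illustration. The sketch relies on the standard result that, when y depends on x and z but is regressed on x alone, the slope on x is approximately b1 + b2 * delta, where delta is the slope from regressing z on x.

```python
import random


def slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=5000, b0=0.5, b1=1.0, b2=2.0, seed=0):
    """Simulate y = b0 + b1*x + b2*z + e, then regress y on x alone.

    All parameter values are illustrative assumptions.
    Returns (short_slope, delta, predicted_short_slope).
    """
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    # Binary indicator correlated with x: z = 1 when x plus noise is positive.
    z = [1.0 if xi + rng.gauss(0, 1) > 0 else 0.0 for xi in x]
    y = [b0 + b1 * xi + b2 * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

    short_slope = slope(x, y)   # regression that omits z
    delta = slope(x, z)         # auxiliary regression of z on x
    predicted = b1 + b2 * delta # omitted-variable-bias approximation
    return short_slope, delta, predicted


if __name__ == "__main__":
    short_slope, delta, predicted = simulate()
    print("slope with z omitted:      ", round(short_slope, 3))
    print("true coefficient b1:        1.0")
    print("b1 + b2 * delta (predicted):", round(predicted, 3))
```

In this setup the slope estimated without z sits well above the true b1, and the gap tracks b2 * delta. Recovering a usable estimate of z, which is the purpose of the genetic algorithm, would allow z to be included as a regressor and the gap to be reduced.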
