Tests for Specification Errors in Classical Linear Least‐Squares Regression Analysis

SUMMARY The effects on the distribution of least-squares residuals of a series of model mis-specifications are considered. It is shown that for a variety of specification errors the distributions of the least-squares residuals are normal, but with non-zero means. An alternative predictor of the disturbance vector is used in developing four procedures for testing for the presence of specification error. The specification errors considered are omitted variables, incorrect functional form, simultaneous equation problems and heteroskedasticity.

The objectives of this paper are two. The first is to derive the distributions of the classical linear least-squares residuals under a variety of specification errors. The errors considered are omitted variables, incorrect functional form, simultaneous equation problems and heteroskedasticity. It is assumed that the disturbance terms are independently and normally distributed. It will be shown that the effect of the specification errors considered above is, with the exception of the error of heteroskedasticity, to yield residuals which, though normally distributed, do not have zero means, so that the distribution of the squared residuals is non-central χ².

The second objective is to derive procedures to test for the presence of the specification errors considered in the first part of the paper. The tests are developed by comparing the distribution of residuals under the hypothesis that the specification of the model is correct to the distribution of the residuals yielded under the alternative hypothesis that there is a specification error of one of the types considered in the first part of the paper. As a preliminary step to deriving the test procedures, the classical least-squares residual vector is transformed to a sub-vector which has more desirable properties for testing the null hypothesis that the specification of the model is correct.
Also, under certain assumptions, with respect to the alternative hypothesis, it is shown that the mean vector of the residuals can be approximated by a linear sum of vectors qj,
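
The claim that specification error leaves the residuals with a non-zero mean vector can be illustrated with a small numerical sketch. The example below is not taken from the paper: the data, the coefficients and the helper name `ols_line_residuals` are assumptions chosen for illustration. It relies only on the fact that least-squares residuals are a linear function of the dependent variable, so the expected residual vector equals the residual vector obtained by fitting the noise-free mean of y.

```python
def ols_line_residuals(x, y):
    """Residuals from a least-squares fit of y on an intercept and x.

    Hypothetical helper for illustration; uses the closed-form
    simple-regression estimates.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]


# Assumed design points (illustrative only).
x = [float(i) for i in range(-3, 4)]

# Correct specification: E[y] is linear in x, so the expected
# residual vector is zero.
mean_y_correct = [1.0 + 2.0 * xi for xi in x]
expected_resid_correct = ols_line_residuals(x, mean_y_correct)

# Incorrect functional form: the true mean contains an x**2 term that
# the fitted model omits, so the expected residual vector is non-zero.
mean_y_wrong = [1.0 + 2.0 * xi + 0.5 * xi ** 2 for xi in x]
expected_resid_wrong = ols_line_residuals(x, mean_y_wrong)

print("expected residuals, correct model:     ", expected_resid_correct)
print("expected residuals, misspecified model:", expected_resid_wrong)
```

In this sketch the expected residuals under the misspecified model still sum to zero because the fitted equation includes an intercept, yet the individual elements are systematically non-zero. That pattern is the kind of departure a test based on the residual mean vector is meant to detect.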
