Constrained Regression for Interval-Valued Data

Current regression models for interval-valued data do not guarantee that the predicted lower bound of the interval is always smaller than its upper bound. We propose a constrained regression model that preserves the natural order of the interval in all instances, both for in-sample fitted intervals and for interval forecasts. Within the framework of interval time series, we specify a general dynamic bivariate system for the upper and lower bounds of the intervals. Imposing the order of the interval bounds on the model makes the bivariate probability density function of the errors conditionally truncated. In this context, the ordinary least squares (OLS) estimators of the parameters of the system are inconsistent. Estimation by maximum likelihood is possible but computationally burdensome because the estimator is nonlinear under truncation. We propose a two-step procedure that combines maximum likelihood and least squares estimation, and a modified two-step procedure that combines maximum likelihood and minimum-distance estimation. In both instances, the estimators are consistent. However, when multicollinearity arises in the second step of the estimation, the modified two-step procedure is superior at identifying the model regardless of the severity of the truncation. Monte Carlo simulations show good finite-sample properties of the proposed estimators. A comparison with current methods in the literature shows that our proposed methods deliver smaller losses and better estimators (no bias and low mean squared error) than competing approaches. We illustrate our approach with the daily interval of low/high S&P 500 returns and find that truncation is very severe during and after the financial crisis of 2008, so OLS estimates should not be trusted and the modified two-step procedure should be implemented. Supplementary materials for this article are available online.
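The inconsistency of OLS under the order constraint can be illustrated with a small simulation. The sketch below is not the dynamic bivariate system of the article: it assumes a static toy design with a single regressor, independent standard normal errors, and identical conditional means for both bounds, and it imposes the order by discarding draws where the lower bound exceeds the upper bound. The function names `simulate_ordered_intervals` and `ols` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def simulate_ordered_intervals(n, rng, a_l=0.0, a_u=0.0, b=1.0):
    """Toy design (an assumption, not the article's model):
    lower = a_l + b*x + e_l, upper = a_u + b*x + e_u,
    keeping only draws with lower <= upper (rejection sampling)."""
    xs, lows, ups = [], [], []
    kept = 0
    while kept < n:
        x = rng.normal(size=n)
        low = a_l + b * x + rng.normal(size=n)
        up = a_u + b * x + rng.normal(size=n)
        ok = low <= up  # the order constraint truncates the errors
        xs.append(x[ok])
        lows.append(low[ok])
        ups.append(up[ok])
        kept += int(ok.sum())
    x = np.concatenate(xs)[:n]
    low = np.concatenate(lows)[:n]
    up = np.concatenate(ups)[:n]
    return x, low, up


def ols(x, y):
    """Return [intercept, slope] from a least squares fit of y on x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


rng = np.random.default_rng(0)
x, low, up = simulate_ordered_intervals(20000, rng)
coef_low = ols(x, low)
coef_up = ols(x, up)

# True intercepts are 0 and true slopes are 1 for both bounds.
print("lower bound  [intercept, slope]:", coef_low)
print("upper bound  [intercept, slope]:", coef_up)
```

In this toy design the truncated errors have conditional means of about -0.56 and +0.56 (that is, -1/sqrt(pi) and +1/sqrt(pi)), so the OLS intercepts settle near those values instead of the true value 0, however large the sample. The slopes are unaffected here only because the truncation point does not depend on the regressor; when it does, as in a dynamic system, the slope estimates would be expected to be distorted as well.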
