A resampling approach for interval-valued data regression

We consider interval-valued data, which appear frequently as advanced technologies are adopted in current data collection processes. Interval-valued data are data observed as ranges rather than single values. In the last decade, several approaches to the regression analysis of interval-valued data have been introduced, but little work has been done on statistical inference for the regression model. In this paper, we propose a new approach that fits a linear regression model to interval-valued data using a resampling idea. A key advantage is that it enables inference on the model, such as the overall model significance test and tests of individual coefficients. We demonstrate the proposed approach on simulated and real data examples, and compare its performance with that of existing methods. © 2012 Wiley Periodicals, Inc. Statistical Analysis and Data Mining, 2012
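To make the resampling idea concrete, the following is a minimal sketch of one way such a procedure could look. It is not the paper's algorithm: the abstract does not specify the sampling scheme, so drawing a single point uniformly from each observed interval, fitting ordinary least squares to the drawn points, and averaging the coefficients over many draws are all assumptions. The function name `resample_fit` and its parameters are hypothetical.

```python
import numpy as np


def resample_fit(x_lo, x_hi, y_lo, y_hi, n_resamples=500, seed=0):
    """Illustrative resampling fit for interval-valued data (assumed scheme).

    Each observation is an interval [lo, hi]. On every replicate, one point
    is drawn uniformly inside each interval (an assumption, not taken from
    the paper), an ordinary least squares line is fitted to the drawn
    points, and the coefficients are stored.
    """
    rng = np.random.default_rng(seed)
    x_lo = np.asarray(x_lo, dtype=float)
    x_hi = np.asarray(x_hi, dtype=float)
    y_lo = np.asarray(y_lo, dtype=float)
    y_hi = np.asarray(y_hi, dtype=float)

    # Treat a single predictor as an (n, 1) matrix.
    if x_lo.ndim == 1:
        x_lo = x_lo[:, None]
        x_hi = x_hi[:, None]

    n, p = x_lo.shape
    coefs = np.empty((n_resamples, p + 1))

    for b in range(n_resamples):
        x = rng.uniform(x_lo, x_hi)  # one point per predictor interval
        y = rng.uniform(y_lo, y_hi)  # one point per response interval
        design = np.column_stack([np.ones(n), x])  # intercept column first
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        coefs[b] = beta

    # Mean coefficients plus the full set of replicate estimates.
    return coefs.mean(axis=0), coefs


if __name__ == "__main__":
    centers = np.arange(10.0)
    beta_hat, draws = resample_fit(
        centers - 0.5, centers + 0.5,
        1.0 + 2.0 * centers - 0.5, 1.0 + 2.0 * centers + 0.5,
    )
    print("mean coefficients:", beta_hat)
    print("spread across replicates:", draws.std(axis=0))
```

The spread of the replicate estimates gives a rough sense of how much the fitted coefficients depend on where points fall within the intervals. The significance tests mentioned in the abstract would require the paper's actual construction, which this sketch does not attempt to reproduce.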
