This paper introduces two new approaches to fitting a linear regression model on interval-valued data. Each example in the learning set is described by a feature vector in which each feature value is an interval. In the first approach, two independent linear regression models are fitted, one on the mid-points and one on the ranges of the interval values assumed by the variables on the learning set. In the second approach, a single multivariate linear regression model is fitted on these mid-points and ranges. The lower and upper bounds of the interval value of the dependent variable are predicted from its mid-point and range, which are estimated by applying the fitted linear regression models to the mid-point and range of each interval value of the independent variables. The proposed prediction methods are evaluated by the average behavior of the root mean squared error and the determination coefficient in a Monte Carlo experiment, in comparison with the method proposed by Billard and Diday [2].
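The first approach can be illustrated with a minimal sketch. It assumes ordinary least squares via NumPy, takes the range to be the upper bound minus the lower bound, and reconstructs the predicted bounds as mid-point minus and plus half the predicted range. The function names (`to_mid_range`, `fit_mid_range`, `predict_interval`) are hypothetical and chosen for illustration only; this is not the authors' implementation.

```python
import numpy as np


def to_mid_range(lower, upper):
    """Convert interval bounds to (mid-point, range); range = upper - lower (assumed convention)."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return (lower + upper) / 2.0, upper - lower


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_ols(coef, X):
    """Apply fitted coefficients to new inputs."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def fit_mid_range(x_lower, x_upper, y_lower, y_upper):
    """Fit two independent models: one on mid-points, one on ranges."""
    x_mid, x_rng = to_mid_range(x_lower, x_upper)
    y_mid, y_rng = to_mid_range(y_lower, y_upper)
    return fit_ols(x_mid, y_mid), fit_ols(x_rng, y_rng)


def predict_interval(models, x_lower, x_upper):
    """Predict the lower and upper bounds from the predicted mid-point and range."""
    coef_mid, coef_rng = models
    x_mid, x_rng = to_mid_range(x_lower, x_upper)
    y_mid = predict_ols(coef_mid, x_mid)
    y_rng = predict_ols(coef_rng, x_rng)
    return y_mid - y_rng / 2.0, y_mid + y_rng / 2.0
```

Nothing in this sketch constrains the predicted range to be non-negative, so on noisy data a predicted lower bound could exceed the predicted upper bound; how to handle that case is left open here.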
[1] Hans-Hermann Bock, et al. Analysis of Symbolic Data: Exploratory Methods for Extracting Statistical Information from Complex Data, 2000.
[2] Elizabeth A. Peck, et al. Introduction to Linear Regression Analysis, 2001.
[3] N. Draper, et al. Applied Regression Analysis, 1966.
[4] L. Billard, et al. From the Statistics of Data to the Statistics of Knowledge, 2003.
[5] L. Billard, et al. Regression Analysis for Interval-Valued Data, 2000.
[6] L. Billard, et al. Symbolic Regression Analysis, 2002.