Regressions in practice can include outliers and other unknown subpopulation structure. For example, mixtures of regressions occur if there is an omitted categorical predictor such as gender, species or location, and different regressions occur within each category. A lurking variable, one that has an important effect but is not present among the predictors under consideration (Box 1966), can seriously complicate a regression analysis. Regression structure with lurking variables is illustrated in Figure 1a, which is a stylized representation of subpopulation structures in a regression with response Y and predictors Xk. The contours A, C and E represent different subpopulation regressions. Point B represents an isolated outlier, while the circular contours D represent an outlying cluster. The regression illustrated in the figure consists of a mixture of five distinct regressions, one for each of the four subpopulations and one for the isolated outlier.
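As an illustrative sketch only (not taken from the paper), the structure described above can be simulated with a single predictor. All intercepts, slopes, sample sizes and locations below are hypothetical values chosen for the illustration; the labels A, C, E, D and B simply mirror the description of Figure 1a. The sketch compares a pooled least-squares fit, which ignores the lurking group label, with fits made separately within A, C and E.

```python
import numpy as np

def simulate_mixture(seed=0):
    """Simulate a one-predictor mixture of regressions (hypothetical values)."""
    rng = np.random.default_rng(seed)
    # Assumed (intercept, slope, sample size) for subpopulations A, C, E.
    groups = {"A": (0.0, 2.0, 100), "C": (5.0, -1.0, 100), "E": (-3.0, 0.5, 100)}
    x_parts, y_parts, labels = [], [], []
    for name, (b0, b1, n) in groups.items():
        x = rng.uniform(-2.0, 2.0, size=n)
        y = b0 + b1 * x + rng.normal(scale=0.3, size=n)
        x_parts.append(x)
        y_parts.append(y)
        labels += [name] * n
    # Outlying cluster D: a tight blob away from the three regressions.
    n_d = 15
    x_parts.append(rng.normal(4.0, 0.2, size=n_d))
    y_parts.append(rng.normal(-8.0, 0.2, size=n_d))
    labels += ["D"] * n_d
    # Isolated outlier B: a single point.
    x_parts.append(np.array([-4.0]))
    y_parts.append(np.array([10.0]))
    labels += ["B"]
    return np.concatenate(x_parts), np.concatenate(y_parts), np.array(labels)

def ols(x, y):
    """Return (intercept, slope) from an ordinary least-squares fit."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

x, y, labels = simulate_mixture()
pooled = ols(x, y)  # fit that ignores the lurking group label
per_group = {g: ols(x[labels == g], y[labels == g]) for g in ("A", "C", "E")}

print("pooled (intercept, slope):", pooled)
for g, coef in per_group.items():
    print(g, "(intercept, slope):", coef)
```

In this sketch the group label plays the role of the lurking variable: it is used to generate the data but is withheld from the pooled fit, so the pooled coefficients need not resemble any of the within-group regressions.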
[1] R. H. Moore, et al. Regression Graphics: Ideas for Studying Regressions Through Graphics. Technometrics, 1998.
[2] Ker-Chau Li, et al. On Principal Hessian Directions for Data Visualization and Dimension Reduction: Another Application of Stein's Lemma. 1992.
[3] Sanford Weisberg, et al. Graphs in Statistical Analysis: Is the Medium the Message? 1999.
[4] G. Box. Use and Abuse of Regression. 1966.
[5] Ker-Chau Li. Sliced inverse regression for dimension reduction (with discussion). 1991.
[6] D. Hawkins. Multivariate Statistics: A Practical Approach. 1990.