Exact influence measures are applied to evaluate a principal component decomposition of high-dimensional data. The data, which are examined in detail, are near infra-red transmission profiles used to classify samples of rice following a preliminary principal component analysis. A normalization of the eigenvalue influence statistics is proposed which ensures that the measures reflect the relative orientations of observations rather than their overall Euclidean distance from the sample mean. The analyst thereby obtains more information from an analysis of eigenvalues than approximate approaches to eigenvalue influence provide. This is particularly important for high-dimensional data, where a complete investigation of eigenvector perturbations may be cumbersome. The results are used to suggest a new class of influence measures based on ratios of Euclidean distances in orthogonal spaces.
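As a rough illustration of what an exact (case-deletion) eigenvalue influence measure involves, the following minimal sketch recomputes the covariance eigenvalues with each observation removed in turn and records the change. It is not the procedure of the paper: the function name `eigenvalue_deletion_influence`, the synthetic data, and the planted outlier are all assumptions made for illustration, and the normalization proposed above is not specified here and is therefore not implemented.

```python
import numpy as np


def eigenvalue_deletion_influence(X):
    """Exact change in each covariance eigenvalue when one row is deleted.

    Hypothetical helper for illustration only. Returns the full-sample
    eigenvalues (descending) and an (n, p) array whose row i holds
    lambda_(-i) - lambda, the eigenvalue change after deleting row i.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # eigvalsh returns ascending eigenvalues; reverse to descending order.
    full = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]
    influence = np.empty((n, full.size))
    for i in range(n):
        reduced = np.delete(X, i, axis=0)  # drop observation i
        lam = np.linalg.eigvalsh(np.cov(reduced, rowvar=False))[::-1]
        influence[i] = lam - full
    return full, influence


# Synthetic data (an assumption, not the rice spectra): 30 observations in
# 5 dimensions, with observation 0 planted far out along the first axis.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
X[0] = [15.0, 0.0, 0.0, 0.0, 0.0]

full, influence = eigenvalue_deletion_influence(X)
# Rank observations by the total absolute change across all eigenvalues.
most_influential = int(np.argmax(np.abs(influence).sum(axis=1)))
print("most influential observation:", most_influential)
```

These raw deletion statistics are driven largely by how far an observation lies from the sample mean, which is the behaviour the proposed normalization is intended to separate from the effect of an observation's orientation.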