A comparison of imputation techniques for internal preference mapping, using Monte Carlo simulation

Abstract: The usual algorithm for internal preference mapping requires a complete set of observations, so the technique cannot be used to analyse trials based on incomplete block designs. A simulation study was carried out to compare techniques for imputing missing values under various conditions. Sets of simulated preference data with different characteristics were constructed. Monte Carlo simulation was used to create missing observations in these sets; the imputation techniques were applied to the data; and the results of preference mapping based on the imputed data were compared with those from the complete data set. Convergence problems were found with two techniques. Analysis of variance revealed that effects on performance were dominated by the proportion of data missing, the level of noise in the data, and the size of the data set. Differences in performance among the three convergent imputation techniques were small; mean substitution is recommended, as it performed as well as more complex iterative techniques. The results were broadly confirmed by a similar study on a genuine set of preference data.
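The procedure described in the abstract (delete values at random, impute, map, compare with the complete-data map) can be illustrated with a minimal Python sketch. It is not the study's implementation, and several details are assumptions the abstract does not specify: the data layout (products as rows, consumers as columns), mean substitution using each consumer's mean over the products they scored, preference mapping computed as a principal component analysis of the consumer-centred matrix, and agreement measured with the RV coefficient. The data dimensions, noise level, and 20% missing rate are likewise hypothetical.

```python
import numpy as np

def mean_substitute(data):
    """Replace each NaN with the mean of the observed values in its column.

    Assumption: columns are consumers, so a missing score is filled with
    that consumer's mean over the products they did score.
    """
    filled = data.copy()
    col_means = np.nanmean(filled, axis=0)
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = col_means[cols]
    return filled

def preference_map(data, n_dims=2):
    """Product scores from a PCA of the column-centred data (via SVD)."""
    centered = data - data.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_dims] * s[:n_dims]

def rv_coefficient(a, b):
    """RV coefficient between two configurations of the same products.

    It is invariant to rotation and reflection of either configuration,
    and lies between 0 and 1. Using it here is an assumption.
    """
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    sa, sb = a @ a.T, b @ b.T
    return np.trace(sa @ sb) / np.sqrt(np.trace(sa @ sa) * np.trace(sb @ sb))

# Hypothetical simulated data: two underlying dimensions plus noise.
rng = np.random.default_rng(0)
n_products, n_consumers = 12, 60
structure = rng.normal(size=(n_products, 2)) @ rng.normal(size=(2, n_consumers))
complete = structure + 0.5 * rng.normal(size=structure.shape)

# Monte Carlo step: delete roughly 20% of the observations at random.
mask = rng.random(complete.shape) < 0.2
incomplete = complete.copy()
incomplete[mask] = np.nan

# Impute, map, and compare with the map from the complete data.
imputed = mean_substitute(incomplete)
agreement = rv_coefficient(preference_map(complete), preference_map(imputed))
print(f"RV agreement with complete-data map: {agreement:.3f}")
```

Repeating the deletion step many times over different missing rates, noise levels, and data-set sizes would give the kind of factorial simulation the abstract describes, with the agreement measure as the response in an analysis of variance.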
