Prediction of Euclidean distances with discrete and continuous outcomes

The objective of this paper is to predict generalized Euclidean distances in the context of discrete and quantitative variables, and then to derive their statistical properties. We first consider the simultaneous modelling of discrete and continuous random variables with covariates and obtain the likelihood. We derive an important property that is useful for its practical maximization. We then study the prediction of any Euclidean distance and its statistical properties, especially for the Mahalanobis distance. The quality of distance estimation is analyzed through simulations. These results are applied to our motivating example: the official distinction procedure of rapeseed varieties.
