Regularized non-parametric multivariate density and conditional density estimation

In this paper, a distance-based method for multivariate non-parametric density and conditional density estimation is proposed. The key contribution is the formulation of both estimation problems as optimization problems over the weights of Gaussian mixtures whose components are centered at the samples and share identical parameters. The minimization is based on the modified Cramér-von Mises distance of the Localized Cumulative Distributions, which removes the ambiguity in the definition of the multivariate cumulative distribution function. The objective is augmented with a regularization term that penalizes the density's roughness in order to avoid overfitting. The resulting estimation problems, for densities as well as conditional densities, can be phrased as readily implementable quadratic programs. Experimental comparisons against EM, SVR, and GPR, based on the log-likelihood and on performance in benchmark recursive filtering applications, show that the estimated densities are of high quality and are obtained at lower computational cost, since the density representations are sparser.
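The following sketch illustrates the weight-optimization idea for the unconditional case: Gaussian kernels with identical covariance are placed at the samples, and their weights are obtained from a small constrained quadratic program. It is only a simplified stand-in for the method described above: the modified Cramér-von Mises distance of the Localized Cumulative Distributions is replaced by a kernel-smoothed L2 distance between the mixture and the empirical (Dirac) distribution of the samples, and the roughness penalty is replaced by a simple ridge term. The function name `fit_mixture_weights` and the parameters `sigma`, `h`, and `lam` are illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import multivariate_normal


def gauss_gram(x, cov):
    """Matrix of pairwise Gaussian evaluations G[i, j] = N(x_i; x_j, cov)."""
    n = len(x)
    return np.array([[multivariate_normal.pdf(x[i], mean=x[j], cov=cov)
                      for j in range(n)] for i in range(n)])


def fit_mixture_weights(samples, sigma=0.3, h=0.15, lam=1e-4):
    """Weights of a Gaussian mixture with one component N(x; x_i, sigma^2 I) per sample.

    The fit criterion is a kernel-smoothed L2 distance between the mixture and the
    empirical (Dirac) distribution of the samples; sigma, h, and lam are assumed
    tuning parameters.
    """
    N, d = samples.shape
    I = np.eye(d)
    # Closed-form Gaussian inner products: <N(.; x_i, A), N(.; x_j, B)> = N(x_i; x_j, A + B)
    Q = gauss_gram(samples, 2.0 * (sigma**2 + h**2) * I)          # mixture vs. mixture
    b = gauss_gram(samples, (sigma**2 + 2.0 * h**2) * I).mean(1)  # mixture vs. Dirac mixture
    H = Q + lam * np.eye(N)                                       # ridge-style roughness penalty

    # Quadratic program: min_w  w^T H w - 2 b^T w   s.t.  w >= 0, sum(w) = 1
    res = minimize(lambda w: w @ H @ w - 2.0 * b @ w,
                   np.full(N, 1.0 / N),
                   jac=lambda w: 2.0 * (H @ w - b),
                   bounds=[(0.0, None)] * N,
                   constraints=({'type': 'eq', 'fun': lambda w: w.sum() - 1.0},),
                   method='SLSQP')
    return res.x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=(40, 2))   # toy 2-D data set
    w = fit_mixture_weights(data)
    print(w.round(3), w.sum())        # nonnegative weights summing to one
```

As stated in the abstract, the conditional density estimation problem admits the same structure: only the quadratic and linear terms of the program change, while the simplex-type constraints on the weights remain.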
