Kernel-Based Hybrid Random Fields for Nonparametric Density Estimation

Hybrid random fields are a recently proposed graphical model for pseudo-likelihood estimation in discrete domains. In this paper, we develop a continuous version of the model for nonparametric density estimation. To this end, Nadaraya-Watson kernel estimators are used to model the local conditional densities within hybrid random fields. First, we introduce a heuristic algorithm for tuning the kernel bandwidths in the conditional density estimators. Second, we propose a novel method for initializing the structure learning algorithm employed for hybrid random fields, which was originally designed for discrete variables. To test the accuracy of the proposed technique, we use a number of synthetic pattern classification benchmarks, generated from random distributions featuring nonlinear correlations between the variables. Compared with state-of-the-art nonparametric and semiparametric learning techniques for probabilistic graphical models, kernel-based hybrid random fields regularly outperform each considered alternative in terms of recognition accuracy, while preserving the scalability properties (with respect to the number of variables) that originally motivated their introduction.
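
As an illustration of the kind of estimator the abstract refers to, the following is a minimal sketch of a kernel conditional density estimate of p(y | x) and of a log pseudo-likelihood built from such conditionals. It is not the paper's implementation. The Gaussian kernel, the single bandwidth shared by all conditioning variables, the fixed (untuned) bandwidth values, and the names `conditional_density`, `log_pseudo_likelihood`, and `blankets` are all assumptions made for the example. The conditioning sets are supplied by hand here, whereas the paper learns the structure.

```python
import math

def gaussian_kernel(u, h):
    """Gaussian kernel with bandwidth h, evaluated at the scalar offset u."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2.0 * math.pi))

def conditional_density(y, x, data_y, data_x, h_y, h_x):
    """Kernel estimate of p(y | x).

    data_y: list of floats, samples of the target variable.
    data_x: list of tuples, matching samples of the conditioning variables.
    h_y, h_x: bandwidths (assumed fixed here; h_x is shared by all
              conditioning variables for simplicity).
    """
    num = 0.0
    den = 0.0
    for yi, xi in zip(data_y, data_x):
        # Product kernel over the conditioning variables.
        w = 1.0
        for xj, xij in zip(x, xi):
            w *= gaussian_kernel(xj - xij, h_x)
        num += w * gaussian_kernel(y - yi, h_y)
        den += w
    # If every weight underflows to zero, report zero density.
    return num / den if den > 0.0 else 0.0

def log_pseudo_likelihood(point, data, blankets, h):
    """Sum over variables i of log p(point[i] | point[blankets[i]]).

    blankets: dict mapping a variable index to the indices it is
              conditioned on (a hypothetical, hand-specified structure).
    """
    total = 0.0
    for i, blanket in blankets.items():
        data_y = [row[i] for row in data]
        data_x = [tuple(row[j] for j in blanket) for row in data]
        x = tuple(point[j] for j in blanket)
        p = conditional_density(point[i], x, data_y, data_x, h, h)
        total += math.log(max(p, 1e-300))  # guard against log(0)
    return total

# Toy usage: two variables, each conditioned on the other.
data = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.2), (3.0, 2.8)]
blankets = {0: [1], 1: [0]}
score = log_pseudo_likelihood((1.0, 1.0), data, blankets, h=0.5)
```

For a fixed x, the estimate is a weighted mixture of Gaussians in y whose weights sum to one, so it integrates to one over y. With an empty conditioning set it reduces to an ordinary kernel density estimate of the marginal.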
