Painless Embeddings of Distributions: the Function Space View (Part 1)

where K is the (fixed) number of mixture components, π is a vector of mixing weights, and p(x|k) are the densities for each component. We consider some examples below.

2 Gaussian mixture models

Consider the dataset of height and weight in Figure 1. It is clear that there are two subpopulations in this data set, and in this case they are easy to interpret: one represents males and the other females. Within each class or cluster, the data is fairly well represented by a 2D Gaussian (as can be seen from the fitted ellipses), but to model the data as a whole, we need to use a mixture of Gaussians (MoG) or a Gaussian mixture model (GMM). This is defined as follows:

$$p(x \mid \theta) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)$$

where μ_k and Σ_k are the mean and covariance of component k.
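As a rough illustration of the mixture density described above, the following sketch evaluates a two-component 2D Gaussian mixture at a single point and computes the posterior probability of each component. The function names and all numeric parameters (weights, means, covariances, and the query point) are hypothetical values chosen for illustration; they are not fitted to the data in Figure 1.

```python
import math


def gaussian_pdf_2d(x, mean, cov):
    """Density N(x | mean, cov) of a 2D Gaussian at the point x."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Squared Mahalanobis distance, using the closed-form 2x2 inverse.
    maha = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * maha) / (2.0 * math.pi * math.sqrt(det))


def mixture_pdf(x, weights, means, covs):
    """p(x | theta) = sum_k weights[k] * N(x | means[k], covs[k])."""
    return sum(w * gaussian_pdf_2d(x, m, s)
               for w, m, s in zip(weights, means, covs))


def responsibilities(x, weights, means, covs):
    """Posterior p(k | x) for each component, via Bayes' rule."""
    joint = [w * gaussian_pdf_2d(x, m, s)
             for w, m, s in zip(weights, means, covs)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical parameters (height in inches, weight in pounds); assumptions only.
weights = [0.5, 0.5]
means = [(64.0, 130.0), (70.0, 180.0)]
covs = [((9.0, 20.0), (20.0, 400.0)),
        ((9.0, 20.0), (20.0, 400.0))]

x = (66.0, 140.0)
density = mixture_pdf(x, weights, means, covs)
post = responsibilities(x, weights, means, covs)
print("mixture density:", density)
print("component posteriors:", post)
```

The mixing weights must be non-negative and sum to one for the result to be a valid density. The posteriors returned by `responsibilities` are the quantities an EM-style fitting procedure would use to softly assign each point to a cluster.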
