Mixture Density Networks

Figure 7: Plot of the conditional probability densities p(t | x) of the target data as a function of t, for x = 0.2, x = 0.5 and x = 0.8, obtained by taking vertical slices through the contours in Figure 6. The Mixture Density Network correctly captures the multimodal nature of the target data density function at intermediate values of x.

Figure 8: Plot of the priors α_i(x) as a function of x for the three kernel functions of the same Mixture Density Network that was used to plot Figure 6. At both small and large values of x, where the conditional probability density of the target data is unimodal, only one of the kernels has a prior probability that differs significantly from zero. At intermediate values of x, where the conditional density is tri-modal, the three kernels have comparable priors.
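The two kinds of curve described in these captions can be illustrated with a minimal sketch: for a fixed x, the priors are normalised mixing coefficients and the slice p(t | x) is the prior-weighted sum of the kernels. The sketch assumes one-dimensional Gaussian kernels and priors obtained from a softmax over network outputs. All numerical values below (`z_alpha`, `mu`, `sigma`) are hypothetical, chosen only to produce three comparable priors; they are not taken from the network behind Figures 6 to 8, and the function names are illustrative.

```python
import numpy as np

def softmax(z):
    """Map unconstrained outputs to priors that are positive and sum to one."""
    z = z - np.max(z)          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def conditional_density(t, alpha, mu, sigma):
    """Evaluate p(t | x) = sum_i alpha_i * N(t; mu_i, sigma_i^2) on a grid of t."""
    t = np.asarray(t, dtype=float)[:, None]            # shape (T, 1)
    norm = 1.0 / (np.sqrt(2.0 * np.pi) * sigma)        # shape (m,)
    kernels = norm * np.exp(-0.5 * ((t - mu) / sigma) ** 2)  # shape (T, m)
    return kernels @ alpha                             # shape (T,)

# Hypothetical network outputs at one intermediate value of x (assumed values).
z_alpha = np.array([0.1, 0.0, -0.1])    # unnormalised prior outputs
alpha = softmax(z_alpha)                # three comparable priors
mu = np.array([0.2, 0.5, 0.8])          # kernel centres
sigma = np.array([0.05, 0.05, 0.05])    # kernel widths

# One vertical slice: the density over t at this fixed x.
t_grid = np.linspace(-0.5, 1.5, 2001)
p = conditional_density(t_grid, alpha, mu, sigma)

# Sanity checks: the slice integrates to about one and has three modes.
area = np.sum(p) * (t_grid[1] - t_grid[0])
is_peak = (p[1:-1] > p[:-2]) & (p[1:-1] > p[2:])
n_modes = int(np.sum(is_peak))
```

Repeating the evaluation of `alpha` over a range of x values would give curves of the kind shown in Figure 8, while repeating `conditional_density` at a few fixed x values gives slices of the kind shown in Figure 7. When one prior dominates, the same formula collapses to a single-kernel, unimodal density.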