Maximum-likelihood learning of cumulative distribution functions on graphs

For many applications, a probability model can be expressed more easily as a cumulative distribution function (CDF) than as a probability density or mass function (PDF/PMF). One advantage of CDF models is the simplicity with which they represent multivariate heavy-tailed distributions. Examples of fields that can benefit from graphical models for CDFs include climatology and epidemiology, where data follow heavy-tailed distributions and exhibit spatial correlations, so that dependencies between model variables must be accounted for. However, learning from data typically consists of maximizing the log-likelihood with respect to model parameters, and the likelihood is expressed in terms of the PDF/PMF, not the CDF: the density of a CDF model is obtained by differentiating the CDF with respect to all of its variables. Given a CDF defined on a graph, we present a message-passing algorithm, the gradient-derivative-product (GDP) algorithm, for maximum-likelihood learning of the model, in which messages correspond to local gradients of the likelihood with respect to model parameters. We demonstrate the GDP algorithm on real-world rainfall and H1N1 mortality data, and we show that the heavy-tailed multivariate distributions that arise in these problems can be both naturally parameterized and tractably estimated from data using our algorithm.
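As a minimal illustration of the identity underlying this setup (not the paper's GDP algorithm), the sketch below fits a two-variable CDF model by maximum likelihood: the density is recovered as the mixed partial derivative of the CDF via automatic differentiation, and the log-likelihood is then differentiated with respect to the model parameter. The choice of model (Gumbel's bivariate logistic CDF with unit-Frechet margins, a standard heavy-tailed family), the toy data, and the step size are all assumptions made for the example.

```python
import jax
import jax.numpy as jnp

# Assumed model for illustration: Gumbel's bivariate logistic CDF with
# unit-Frechet (heavy-tailed) margins, dependence parameter 0 < a <= 1:
#   F(x1, x2; a) = exp(-(x1**(-1/a) + x2**(-1/a))**a),  x1, x2 > 0.
def cdf(x1, x2, a):
    return jnp.exp(-((x1 ** (-1.0 / a) + x2 ** (-1.0 / a)) ** a))

# The joint density is the mixed partial derivative of the CDF,
#   p(x1, x2; a) = d^2 F / (dx1 dx2),
# which nested automatic differentiation computes for us.
pdf = jax.grad(jax.grad(cdf, argnums=0), argnums=1)

def neg_log_lik(a, data):
    # data: (n, 2) array of observations; negative sum of log-densities.
    p = jax.vmap(lambda x: pdf(x[0], x[1], a))(data)
    return -jnp.sum(jnp.log(p))

# Toy data and a single gradient step on the log-likelihood with respect
# to the dependence parameter a (step size is arbitrary for this sketch).
data = jnp.array([[1.2, 0.8], [3.5, 2.9], [0.6, 0.7]])
a = 0.5
g = jax.grad(neg_log_lik)(a, data)
a = a - 1e-3 * g
print("updated a:", a)
```

The GDP algorithm of the paper plays the role that nested autodiff plays here, but exploits the graph structure of the CDF so that the required derivatives are computed by local message passing rather than by differentiating a monolithic function.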