A Hybrid Pareto Mixture for Conditional Asymmetric Fat-Tailed Distributions

In many settings, we observe variables X that carry predictive information about a scalar variable of interest Y, with (X, Y) pairs available in a training set. This information can be used to estimate the conditional density p(Y|X=x). In this paper, we propose a conditional mixture model with hybrid Pareto components to estimate p(Y|X=x). The hybrid Pareto is a Gaussian whose upper tail has been replaced by a generalized Pareto tail. In addition to the location and spread parameters of the Gaussian, a third parameter controls the heaviness of the upper tail. Using hybrid Pareto components in a mixture model yields a nonparametric estimator that can adapt to multimodality, asymmetry, and heavy tails. A conditional density estimator is obtained by modeling the parameters of the mixture as functions of X, which we implement with a neural network. Such conditional density estimators have important applications in domains such as finance and insurance. We show experimentally that this approach models the conditional density better, in terms of likelihood, than competing algorithms: conditional mixture models with other types of components and a classical kernel-based nonparametric model.
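To make the construction concrete, the following is a minimal sketch of a hybrid Pareto density and a mixture of such components. The abstract does not say how the Gaussian body and the generalized Pareto tail are joined, so the sketch assumes the junction point is chosen so that the density and its first derivative are continuous there. Under that assumption, the junction and the tail scale follow from the Lambert W function. The function names, the restriction to a tail parameter xi > 0, and the Newton solver are illustrative choices, not taken from the paper.

```python
import math


def lambert_w(z, tol=1e-12, max_iter=100):
    """Principal branch of Lambert W for z >= 0 (solves w * exp(w) = z)."""
    if z < 0:
        raise ValueError("this sketch only handles z >= 0")
    w = math.log1p(z)  # starting point at or above the root for z >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))  # Newton step
        w -= step
        if abs(step) < tol:
            break
    return w


def hybrid_pareto_junction(xi, mu, sigma):
    """Junction alpha, tail scale beta and normaliser gamma.

    Assumption: density and first derivative are continuous at alpha.
    """
    if xi <= 0 or sigma <= 0:
        raise ValueError("this sketch assumes xi > 0 and sigma > 0")
    w = lambert_w((1.0 + xi) ** 2 / (2.0 * math.pi))
    alpha = mu + sigma * math.sqrt(w)
    beta = sigma * (1.0 + xi) / math.sqrt(w)
    # Gaussian mass below alpha plus the full mass (1) of the Pareto tail.
    gamma = 1.0 + 0.5 * (1.0 + math.erf(math.sqrt(w / 2.0)))
    return alpha, beta, gamma


def hybrid_pareto_pdf(y, xi, mu, sigma):
    """Gaussian body below the junction, generalized Pareto tail above it."""
    alpha, beta, gamma = hybrid_pareto_junction(xi, mu, sigma)
    if y <= alpha:
        dens = math.exp(-0.5 * ((y - mu) / sigma) ** 2)
        dens /= sigma * math.sqrt(2.0 * math.pi)
    else:
        dens = (1.0 + xi * (y - alpha) / beta) ** (-1.0 / xi - 1.0) / beta
    return dens / gamma


def mixture_pdf(y, weights, components):
    """Mixture density; components is a list of (xi, mu, sigma) tuples."""
    return sum(
        p * hybrid_pareto_pdf(y, xi, mu, sigma)
        for p, (xi, mu, sigma) in zip(weights, components)
    )


# Hypothetical two-component mixture evaluated at a single point.
density = mixture_pdf(1.5, [0.6, 0.4], [(0.3, 0.0, 1.0), (0.1, 3.0, 0.5)])
```

In a conditional model of the kind described above, the mixture weights and the (xi, mu, sigma) triples would not be constants but outputs of a function of x, for instance a neural network whose outputs are mapped to valid ranges (weights that are positive and sum to one, positive sigma and xi).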
