In recent years, impressive progress has been made in the design of implicit probabilistic models via Generative Adversarial Networks (GANs) and their extension, the Conditional GAN (CGAN). Excellent results have been demonstrated mostly in image processing applications, which involve large, continuous output spaces. These powerful tools have seen almost no application to problems with low-dimensional output spaces. Regression problems involving the inductive learning of a map $y=f(x,z)$, where $z$ denotes noise and $f:\mathbb{R}^n\times \mathbb{R}^k \rightarrow \mathbb{R}^m$ with $m$ small (e.g., $m=1$ or just a few), are a good case in point. The standard approach to regression is to model the output $y$ probabilistically as the sum of a mean function $m(x)$ and a noise term $z$; it is also usual to take the noise to be Gaussian. These choices are made for convenience, so that the likelihood of the observed data is expressible in closed form. In the real world, on the other hand, stochasticity of the output is usually caused by missing or noisy input variables. Such a situation is best represented by an implicit model in which an extra noise vector $z$ is included with $x$ as input. CGANs are naturally suited to designing such implicit models. This paper takes the first step in this direction and compares existing regression methods with CGANs.
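The contrast between the two modeling choices can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the mean function `np.sin`, the noise scale `sigma`, and the form of `implicit_model` are hypothetical choices, not taken from the paper. The explicit model admits a closed-form log-likelihood, whereas the implicit model can only be sampled from.

```python
import numpy as np

rng = np.random.default_rng(0)

def explicit_model(x, sigma=0.1):
    """Additive Gaussian noise: y = m(x) + z, with z ~ N(0, sigma^2)."""
    mean = np.sin(x)  # hypothetical mean function m(x)
    return mean + sigma * rng.normal(size=x.shape)

def gaussian_log_likelihood(y, x, sigma=0.1):
    """Closed-form per-sample log-likelihood under the explicit model."""
    resid = y - np.sin(x)
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * (resid / sigma) ** 2

def implicit_model(x, z):
    """Implicit model y = f(x, z): the noise z enters as an extra input.

    Here z stands in for an unobserved input variable; this particular f
    is a hypothetical example with no closed-form density for y given x.
    """
    return np.sin(x + z) * (1.0 + 0.5 * z)

# Draw samples of y at a fixed input x = 1.0 from both models.
x = np.full(5000, 1.0)
y_explicit = explicit_model(x)
y_implicit = implicit_model(x, rng.normal(size=x.shape))
```

In the implicit case the conditional distribution of `y` given `x` need not be Gaussian or even unimodal, which is the kind of model a CGAN generator would be trained to represent.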
We find, however, that existing methods such as mixture density networks (MDNs) and XGBoost do quite well compared to CGANs in terms of likelihood and mean absolute error, respectively. Both methods are also easier to train than CGANs. CGANs need further innovation to match existing regression solvers in modeling quality and ease of training. In summary, MDNs are the better choice for modeling uncertainty, while XGBoost is the better choice when accurate prediction matters more.
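The two evaluation criteria mentioned above can be sketched as follows. This is a minimal NumPy illustration, assuming a one-dimensional output and a Gaussian mixture predictive density such as an MDN would produce; the function names and array layout are hypothetical and do not reflect the paper's exact evaluation procedure.

```python
import numpy as np

def mixture_nll(y, weights, means, stds):
    """Average negative log-likelihood of y under a 1-D Gaussian mixture.

    y: shape (N,). weights, means, stds: shape (N, K), one mixture per
    sample. Weights are assumed strictly positive with rows summing to 1.
    """
    y = y[:, None]
    log_comp = (np.log(weights)
                - 0.5 * np.log(2 * np.pi * stds**2)
                - 0.5 * ((y - means) / stds) ** 2)
    # Log-sum-exp over the K components for numerical stability.
    m = log_comp.max(axis=1, keepdims=True)
    log_p = m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))
    return -log_p.mean()

def mean_absolute_error(y, y_pred):
    """Mean absolute error between targets and point predictions."""
    return np.abs(y - y_pred).mean()
```

The likelihood rewards a model for capturing the full conditional distribution, while mean absolute error only scores a single point prediction, which is why the two criteria can favor different methods.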