Crowd Counting with Minimal Data Using Generative Adversarial Networks for Multiple Target Regression

In this work, we use a generative adversarial network (GAN) to train crowd counting networks using minimal data. We describe how GAN objectives can be modified so that unlabeled data benefits the training of the inference network in a semi-supervised setting. More generally, we explain how the same methods apply to semi-supervised learning with multiple regression targets, with crowd counting serving as a demonstrative example. Comparing against a convolutional neural network (CNN) with capabilities equivalent to the discriminator in the GAN, we provide experimental results showing that our GAN outperforms the CNN even when the CNN has access to significantly more labeled data. This demonstrates the potential of training such networks to high accuracy with little data. Our primary goal is not to outperform the state of the art by applying an improved method to the entire dataset; instead, we aim to show that semi-supervised learning can reduce the data required to train an inference network to a given accuracy. To this end, we perform systematic experiments with various numbers of images and cameras to show in which situations semi-supervised GANs improve results.
