Generative Adversarial Networks: An Overview
Generative adversarial networks (GANs) provide a way to learn deep representations without extensively annotated training data. They achieve this by deriving backpropagation signals through a competitive process involving a pair of networks. The representations that can be learned by GANs may be used in a variety of applications, including image synthesis, semantic image editing, style transfer, image super-resolution, and classification. The aim of this review paper is to provide an overview of GANs for the signal processing community, drawing on familiar analogies and concepts where possible. In addition to identifying different methods for training and constructing GANs, we also point to remaining challenges in their theory and application.
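As a concrete illustration of the competitive training process described above, the sketch below shows one generator/discriminator update step in PyTorch. The network sizes, learning rates, and the gan_step helper are illustrative assumptions, not details taken from the paper.

# Minimal GAN training step (illustrative sketch; all hyperparameters are assumptions).
import torch
import torch.nn as nn

latent_dim, data_dim = 64, 784  # assumed sizes for illustration

# Generator: maps latent noise z to a synthetic sample.
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                  nn.Linear(256, data_dim), nn.Tanh())
# Discriminator: scores how likely a sample is to be real.
D = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def gan_step(real_batch):
    n = real_batch.size(0)
    z = torch.randn(n, latent_dim)
    fake = G(z)

    # Discriminator update: push real scores toward 1, fake scores toward 0.
    d_loss = bce(D(real_batch), torch.ones(n, 1)) + \
             bce(D(fake.detach()), torch.zeros(n, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator update: its backpropagation signal comes from the discriminator.
    g_loss = bce(D(fake), torch.ones(n, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

# Usage: gan_step(torch.rand(32, data_dim) * 2 - 1)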
Adversarial Feature Learning
The ability of the Generative Adversarial Networks (GANs) framework to learn generative models mapping from simple latent distributions to arbitrarily complex data distributions has been demonstrated empirically, with compelling results showing that the latent space of such generators captures semantic variation in the data distribution. Intuitively, models trained to predict these semantic latent representations given data may serve as useful feature representations for auxiliary problems where semantics are relevant. However, in their existing form, GANs have no means of learning the inverse mapping -- projecting data back into the latent space. We propose Bidirectional Generative Adversarial Networks (BiGANs) as a means of learning this inverse mapping, and demonstrate that the resulting learned feature representation is useful for auxiliary supervised discrimination tasks, competitive with contemporary approaches to unsupervised and self-supervised feature learning.
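A minimal sketch of this idea, assuming a BiGAN-style setup in PyTorch where the discriminator scores concatenated (data, latent) pairs, is shown below. The architectures, the flipped-label generator/encoder loss, and all hyperparameters are placeholder assumptions rather than the authors' exact configuration.

# BiGAN-style sketch: encoder E, generator G, and a discriminator over (x, z) pairs.
import torch
import torch.nn as nn

latent_dim, data_dim = 64, 784  # assumed sizes for illustration

G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                  nn.Linear(256, data_dim), nn.Tanh())
E = nn.Sequential(nn.Linear(data_dim, 256), nn.ReLU(),
                  nn.Linear(256, latent_dim))
# Discriminator sees a concatenated (x, z) pair and decides whether it came
# from (real x, E(x)) or from (G(z), sampled z).
D = nn.Sequential(nn.Linear(data_dim + latent_dim, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1))

opt_ge = torch.optim.Adam(list(G.parameters()) + list(E.parameters()), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def bigan_step(x_real):
    n = x_real.size(0)
    z = torch.randn(n, latent_dim)
    x_fake, z_enc = G(z), E(x_real)
    real_pair = torch.cat([x_real, z_enc], dim=1)
    fake_pair = torch.cat([x_fake, z], dim=1)

    # Discriminator: label (x_real, E(x_real)) as real and (G(z), z) as fake.
    d_loss = bce(D(real_pair.detach()), torch.ones(n, 1)) + \
             bce(D(fake_pair.detach()), torch.zeros(n, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator and encoder jointly try to fool the discriminator,
    # which drives E toward an inverse mapping of G.
    ge_loss = bce(D(real_pair), torch.zeros(n, 1)) + \
              bce(D(fake_pair), torch.ones(n, 1))
    opt_ge.zero_grad(); ge_loss.backward(); opt_ge.step()
    return d_loss.item(), ge_loss.item()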
Self-Supervised Visual Feature Learning With Deep Neural Networks: A Survey
Large-scale labeled data are generally required to train deep neural networks to obtain good performance in visual feature learning from images or videos for computer vision applications. To avoid the extensive cost of collecting and annotating large-scale datasets, self-supervised learning methods, a subset of unsupervised learning methods, have been proposed to learn general image and video features from large-scale unlabeled data without any human-annotated labels. This paper provides an extensive review of deep learning-based self-supervised general visual feature learning methods from images and videos. First, the motivation, general pipeline, and terminology of this field are described. Then the common deep neural network architectures used for self-supervised learning are summarized. Next, the schema and evaluation metrics of self-supervised learning methods are reviewed, followed by the commonly used datasets for images, videos, audio, and 3D data, as well as the existing self-supervised visual feature learning methods. Finally, quantitative performance comparisons of the reviewed methods on benchmark datasets are summarized and discussed for both image and video feature learning, and the paper concludes with a set of promising future directions for self-supervised visual feature learning.
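To make the general pipeline concrete, here is a minimal sketch of one widely used style of pretext task, rotation prediction, in PyTorch. The backbone, the four-way rotation scheme, and the hyperparameters are illustrative assumptions and not specifics drawn from the survey.

# Self-supervised pretext task sketch: predict which of four rotations was applied.
# The trained backbone can later be transferred to a downstream (supervised) task.
import torch
import torch.nn as nn

# Assumed small convolutional backbone for illustration.
backbone = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
rotation_head = nn.Linear(64, 4)  # classifies 0, 90, 180, or 270 degree rotations

opt = torch.optim.Adam(list(backbone.parameters()) + list(rotation_head.parameters()), lr=1e-3)
ce = nn.CrossEntropyLoss()

def pretext_step(images):
    # Pseudo-labels come for free: rotate each image by a random multiple of 90 degrees.
    labels = torch.randint(0, 4, (images.size(0),))
    rotated = torch.stack([torch.rot90(img, int(k), dims=(1, 2))
                           for img, k in zip(images, labels)])
    logits = rotation_head(backbone(rotated))
    loss = ce(logits, labels)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Usage: pretext_step(torch.rand(16, 3, 32, 32))
# Afterwards, backbone can be reused as a feature extractor for a supervised task.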