Paper information - Training data augmentation: An empirical study using generative adversarial net-based approach with normalizing flow models for materials informatics


Abstract: We address the problem of small training datasets for regression models, a significant issue in materials science. Many density estimators based on deep generative models have been proposed. Among generative models, normalizing flows can provide exact density estimates. We use normalizing flows for training data augmentation, with a real-valued non-volume preserving model (real-NVP) as the normalizing flow. A generative adversarial net (GAN)-based training method is applied to improve real-NVP training, with real-NVP serving as the generator. To evaluate the models, we measured the generalization performance of kernel ridge regression trained on the generated data. Experiments were conducted on seven benchmark datasets and a dataset of ionic conductivity of materials to compare the GAN-based real-NVP with state-of-the-art models such as real-NVP and masked autoregressive flows. The experimental results demonstrated that the GAN-based real-NVP was comparable to the state-of-the-art models and implied that data sampled by the GAN-based real-NVP could be used as new training data.
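
To make concrete why a normalizing flow such as real-NVP gives exact densities, the following is a minimal NumPy sketch of a single affine coupling layer, the building block of real-NVP. It is an illustration under stated assumptions, not the paper's implementation: the names `coupling_forward`, `coupling_inverse`, and `log_density` are hypothetical, the scale and shift functions are fixed random maps standing in for trained networks, and the GAN-based training and the kernel ridge regression evaluation described in the abstract are omitted.

```python
import numpy as np

def coupling_forward(x, mask, scale_fn, shift_fn):
    """Data -> latent. Masked dims pass through; the rest get an affine map."""
    x_masked = x * mask
    s = scale_fn(x_masked) * (1 - mask)   # log-scale, only on transformed dims
    t = shift_fn(x_masked) * (1 - mask)   # shift, only on transformed dims
    z = x_masked + (1 - mask) * (x * np.exp(s) + t)
    log_det = s.sum(axis=1)               # log |det Jacobian| of a triangular map
    return z, log_det

def coupling_inverse(z, mask, scale_fn, shift_fn):
    """Latent -> data. Exact inverse of coupling_forward (used for sampling)."""
    z_masked = z * mask
    s = scale_fn(z_masked) * (1 - mask)
    t = shift_fn(z_masked) * (1 - mask)
    return z_masked + (1 - mask) * (z - t) * np.exp(-s)

def log_density(x, mask, scale_fn, shift_fn):
    """Exact log p(x) by change of variables with a standard normal base."""
    z, log_det = coupling_forward(x, mask, scale_fn, shift_fn)
    d = x.shape[1]
    log_base = -0.5 * (z ** 2).sum(axis=1) - 0.5 * d * np.log(2 * np.pi)
    return log_base + log_det

# Assumption: fixed random weights stand in for trained scale/shift networks.
rng = np.random.default_rng(0)
W_s = rng.normal(size=(4, 4))
W_t = rng.normal(size=(4, 4))
scale_fn = lambda h: np.tanh(h @ W_s)
shift_fn = lambda h: h @ W_t

mask = np.array([1.0, 1.0, 0.0, 0.0])     # first two dims condition the last two
x = rng.normal(size=(5, 4))               # toy "training" points

z, log_det = coupling_forward(x, mask, scale_fn, shift_fn)
x_rec = coupling_inverse(z, mask, scale_fn, shift_fn)
log_p = log_density(x, mask, scale_fn, shift_fn)

# New samples: draw latent noise and push it through the inverse map.
samples = coupling_inverse(rng.normal(size=(3, 4)), mask, scale_fn, shift_fn)
```

Because the Jacobian of the coupling map is triangular, its log-determinant is just the sum of the log-scales, which is what keeps the density exact and cheap to evaluate. A full model would stack several such layers with alternating masks; the sampled rows in `samples` play the role of the augmented training data mentioned in the abstract.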

Hiroshi Ohno
