A Minimax Approach for Classification with Big-data

This paper presents a novel methodology for reducing the generalization errors that arise from domain shift in big data classification. The reduction is achieved by introducing a suitably selected domain shift into the training data through what is referred to as a "distortion model". The distortions are introduced through an affine transformation, which yields additional data samples. Next, a deep neural network (NN), referred to as the "classifier", is used to classify both the original and the additional data samples. By learning from both, the classifier compensates for the domain shift while maintaining its performance on the original data. However, the exact magnitude of the shift that would be encountered in real applications is unknown a priori and difficult to predict. The objective is therefore to compensate for the optimal shift that the distortion model can introduce without significantly degrading the performance of the classifier. A two-player zero-sum game is thus designed in which the first player is the distortion model, whose aim is to increase the domain shift, and the second player is the classifier, whose aim is to minimize the impact of that shift. Finally, a direct error-driven learning scheme is used to minimize the impact of the domain shift on the classifier while maximizing the domain shift itself. A comprehensive simulation study demonstrates a 12% improvement in the presence of domain shift. The proposed approach is also shown to improve generalization by 6%.
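The two-player game described in the abstract can be illustrated with a small sketch. It is not the authors' implementation, and it rests on several assumptions made only for illustration: the classifier is a logistic regression rather than a deep NN, both players are updated with plain alternating gradient steps rather than direct error-driven learning, and the affine distortion x' = Ax + c is kept bounded by projecting (A - I, c) onto a ball of hypothetical radius `radius`. The names `train_minimax`, `loss_and_grads`, `lr_clf`, `lr_dist`, and `radius` do not come from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grads(w, b, X, y):
    """Mean logistic loss and its gradients w.r.t. w, b, and the inputs X."""
    p = sigmoid(X @ w + b)
    eps = 1e-12
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    err = (p - y) / len(y)                 # dLoss/dlogit for each sample
    return loss, X.T @ err, err.sum(), np.outer(err, w)

def train_minimax(X, y, steps=200, lr_clf=0.1, lr_dist=0.1, radius=0.5, seed=0):
    """Alternate between a loss-maximizing affine distortion and a
    loss-minimizing classifier (illustrative stand-in for the paper's game)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    w, b = rng.normal(scale=0.01, size=d), 0.0
    A, c = np.eye(d), np.zeros(d)          # distortion starts as the identity
    for _ in range(steps):
        # Player 1 (distortion model): gradient ascent on the classifier's loss.
        Xd = X @ A.T + c
        _, _, _, gX = loss_and_grads(w, b, Xd, y)
        A += lr_dist * (gX.T @ X)          # dLoss/dA = sum_i gX_i x_i^T
        c += lr_dist * gX.sum(axis=0)      # dLoss/dc = sum_i gX_i
        # Assumed constraint: project (A - I, c) back into a ball so the
        # introduced shift stays bounded.
        D = A - np.eye(d)
        norm = np.sqrt(np.sum(D ** 2) + np.sum(c ** 2))
        if norm > radius:
            D, c = D * radius / norm, c * radius / norm
        A = np.eye(d) + D
        # Player 2 (classifier): gradient descent on original + distorted data.
        Xd = X @ A.T + c
        X_all, y_all = np.vstack([X, Xd]), np.concatenate([y, y])
        _, gw, gb, _ = loss_and_grads(w, b, X_all, y_all)
        w -= lr_clf * gw
        b -= lr_clf * gb
    return w, b, A, c
```

Calling `train_minimax(X, y)` on a feature matrix `X` of shape `(n, d)` with binary labels `y` returns the classifier parameters together with the learned distortion `(A, c)`. The projection radius plays the role of the bound on how much shift the distortion player may introduce; how the paper selects that bound is not specified in the abstract.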
