Adversarial Encoder-Multi-Task-Decoder for Multi-Stage Processes

In multi-stage processes, decisions are made in an ordered sequence of stages. Early stages usually have more observations carrying general information (easier and cheaper to collect), while later stages have fewer observations but more specific data. This situation can be represented by a dual-funnel structure, in which the sample size decreases from one stage to the next while the information per observation increases. Training classifiers in this scenario is challenging: the early-stage data may not contain patterns distinctive enough to learn from (underfitting), while the small sample size in later stages can cause overfitting. We address both problems with a framework that combines adversarial autoencoders (AAE), multi-task learning (MTL), and multi-label semi-supervised learning (MLSSL). We extend the decoder of the AAE with an MTL component so that it jointly reconstructs the original input and uses feature nets to predict the features of the subsequent stages. We also introduce a sequence constraint on the output of an MLSSL classifier to guarantee the sequential pattern of the predictions. Using real-world data from different domains (a selection process and medical diagnosis), we show that our approach outperforms state-of-the-art methods.
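
To make the architecture concrete, below is a minimal PyTorch sketch of the three pieces the abstract describes: an encoder whose latent code is adversarially matched to a prior (the AAE part), a decoder with a shared trunk, a reconstruction head, and one "feature net" per later stage (the MTL part), and a sequence constraint that keeps per-stage predictions monotone. All layer sizes, the stage dimensions, and the helper name `enforce_sequence` are illustrative assumptions, not the paper's exact specification.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps early-stage inputs to a latent code z."""
    def __init__(self, in_dim, latent_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                 nn.Linear(128, latent_dim))

    def forward(self, x):
        return self.net(x)

class MultiTaskDecoder(nn.Module):
    """Shared trunk with a reconstruction head plus one 'feature net' per
    later stage: decoding jointly reconstructs the input and predicts the
    features observed at the next stages."""
    def __init__(self, latent_dim, in_dim, stage_dims):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU())
        self.recon = nn.Linear(128, in_dim)
        self.feature_nets = nn.ModuleList(nn.Linear(128, d) for d in stage_dims)

    def forward(self, z):
        h = self.trunk(z)
        return self.recon(h), [f(h) for f in self.feature_nets]

class Discriminator(nn.Module):
    """Tells encoder codes from prior samples (the adversarial AAE part)."""
    def __init__(self, latent_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                 nn.Linear(64, 1))

    def forward(self, z):
        return self.net(z)

def enforce_sequence(stage_probs):
    """Sequence constraint on per-stage 'pass' probabilities: a sample only
    reaches stage k if it passed stages 1..k-1, so predictions are made
    monotonically non-increasing via a cumulative minimum."""
    return torch.cummin(stage_probs, dim=1).values

if __name__ == "__main__":
    enc = Encoder(in_dim=20, latent_dim=8)
    dec = MultiTaskDecoder(latent_dim=8, in_dim=20, stage_dims=[10, 5])
    disc = Discriminator(latent_dim=8)
    bce = nn.BCEWithLogitsLoss()

    x = torch.randn(32, 20)                                  # early-stage features
    next_feats = [torch.randn(32, 10), torch.randn(32, 5)]   # later-stage features

    # Reconstruction phase: reconstruct x and predict next-stage features.
    z = enc(x)
    x_hat, preds = dec(z)
    loss_rec = nn.functional.mse_loss(x_hat, x) + sum(
        nn.functional.mse_loss(p, t) for p, t in zip(preds, next_feats))

    # Regularization phase: push the code distribution toward the prior p(z).
    z_prior = torch.randn_like(z)
    loss_disc = bce(disc(z_prior), torch.ones(32, 1)) + \
                bce(disc(z.detach()), torch.zeros(32, 1))
    loss_gen = bce(disc(z), torch.ones(32, 1))

    # Sequence constraint on the multi-label stage predictions.
    stage_probs = torch.sigmoid(torch.randn(32, 4))  # P(pass stage k)
    print(enforce_sequence(stage_probs)[0])          # a non-increasing row
```

The cumulative minimum is one simple way to make stage predictions sequentially consistent; the optimizer steps and the paper's exact loss weighting and constraint formulation are omitted here.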
