Few-Sample and Adversarial Representation Learning for Continual Stream Mining

Deep Neural Networks (DNNs) have primarily been shown to be useful for closed-world classification problems, where the set of categories is fixed. However, DNNs notoriously fail at label prediction in non-stationary data streams, where unknown or novel classes (categories not present in the training set) continually emerge; for example, new topics constantly appear in social media and e-commerce. To meet this challenge, a DNN must not only detect novel classes effectively but also incrementally learn new concepts from limited samples over time. Literature that addresses both problems simultaneously is limited. In this paper, we focus on improving the generalization of the model to novel classes and on enabling it to learn continually from only a few samples of the novel categories. Unlike existing approaches that rely on abundant labeled instances to re-train or update the model, we propose a new approach based on Few-Sample and Adversarial Representation Learning (FSAR). The key novelty is an adversarial confusion term introduced into both the representation-learning and the few-sample-learning processes; it reduces the over-confidence of the model on the seen classes and thereby improves its ability to detect and learn new categories from only a few samples. FSAR is trained in two stages: first, it learns an intra-class compact and inter-class separated feature embedding to detect novel classes; next, we collect a few labeled samples belonging to the new categories and use episodic training to exploit their intrinsic features for few-sample learning. We evaluated FSAR on several simulated stream benchmarks built from different datasets, and extensive experimental results show that FSAR consistently outperforms current state-of-the-art approaches.
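The abstract only outlines the two-stage pipeline, so the following minimal Python/PyTorch sketch shows one plausible instantiation of it. The triplet loss, the entropy-style confusion term, the centroid-distance threshold tau, and every function and variable name below are illustrative assumptions, not the authors' implementation.

# Sketch of the two-stage FSAR training described above (assumptions, not the paper's code).
import torch
import torch.nn.functional as F

def confusion_term(logits):
    # Adversarial confusion: push the posterior on perturbed/unseen-like inputs
    # toward uniform by maximizing predictive entropy (reduces over-confidence).
    log_p = F.log_softmax(logits, dim=1)
    return (log_p.exp() * log_p).sum(dim=1).mean()   # negative entropy, to be minimized

def stage1_step(embed_net, cls_head, anchor, pos, neg, perturbed, opt, lam=0.1):
    # Stage 1: metric learning for an intra-class compact, inter-class separated
    # embedding, plus the confusion term on adversarially perturbed samples.
    za, zp, zn = embed_net(anchor), embed_net(pos), embed_net(neg)
    loss = F.triplet_margin_loss(za, zp, zn, margin=1.0)
    loss = loss + lam * confusion_term(cls_head(embed_net(perturbed)))
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

def detect_novel(embed_net, x, class_centroids, tau):
    # Flag stream instances whose embedding is far from every seen-class centroid.
    z = embed_net(x)                                  # (B, D)
    d = torch.cdist(z, class_centroids)               # (B, C)
    return d.min(dim=1).values > tau                  # True -> candidate novel-class instance

def stage2_episode(embed_net, support_x, support_y, query_x, query_y, opt, n_way):
    # Stage 2: episodic few-sample learning on the newly labeled categories,
    # here in a prototypical-network style as one plausible choice.
    zs, zq = embed_net(support_x), embed_net(query_x)
    protos = torch.stack([zs[support_y == c].mean(0) for c in range(n_way)])
    logits = -torch.cdist(zq, protos)                 # closer prototype -> higher score
    loss = F.cross_entropy(logits, query_y)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

In such a setup, instances flagged by detect_novel would be buffered until a few labels for the emerging categories are collected, after which stage2_episode incorporates them without requiring abundant labeled data.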
