Learning from Mistakes - A Framework for Neural Architecture Search

Learning from one's mistakes is an effective human learning technique in which learners focus more on the topics where they made mistakes, so as to deepen their understanding. In this paper, we investigate whether this human learning strategy can be applied to machine learning. We propose a novel machine learning method called Learning From Mistakes (LFM), wherein the learner improves its ability to learn by focusing more on its mistakes during revision. We formulate LFM as a three-stage optimization problem: 1) the learner learns; 2) the learner re-learns, focusing on its mistakes; and 3) the learner validates its learning. We develop an efficient algorithm to solve the LFM problem. We apply the LFM framework to neural architecture search on CIFAR-10, CIFAR-100, and ImageNet. Experimental results demonstrate the effectiveness of our method.
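
To make the three-stage structure concrete, the following is a minimal schematic of how such a formulation can be written; the symbols ($A$ for the architecture, $W$ for first-stage weights, $V$ for re-learned weights, $c_i$ for mistake-driven example weights) are introduced here for illustration and are not taken verbatim from the paper.

\begin{align}
\text{Stage 1 (learn):}\quad & W^{*}(A) = \arg\min_{W}\; L_{\mathrm{tr}}(W, A) \\
\text{Stage 2 (re-learn on mistakes):}\quad & V^{*}(A, W^{*}) = \arg\min_{V}\; \sum_{(x_i, y_i) \in D_{\mathrm{tr}}} c_i\bigl(W^{*}(A)\bigr)\, \ell(V, A; x_i, y_i) \\
\text{Stage 3 (validate):}\quad & \min_{A}\; L_{\mathrm{val}}\bigl(V^{*}(A, W^{*}(A)),\, A\bigr)
\end{align}

Here $c_i(W^{*})$ is assumed to be larger for training examples on which the first-stage model makes mistakes, so the second stage re-learns with extra emphasis on those examples, and the outermost problem selects the architecture by validating the re-learned model.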
