Small-Group Learning, with Application to Neural Architecture Search

Small-group learning is a widely used methodology in human education and is highly effective at improving learning outcomes: a small group of students work together toward the same learning objective, expressing their understanding of a topic to their peers, comparing their ideas, and helping each other troubleshoot problems. We investigate whether this powerful learning technique can be borrowed from humans to improve the learning abilities of machines. We propose a novel learning approach called small-group learning (SGL), in which each learner uses its intermediately trained model to generate a pseudo-labeled dataset and re-trains its model on the pseudo-labeled datasets generated by the other learners. We formulate SGL as a multi-level optimization problem involving three learning stages: learners first train their network weights independently; they then train their network weights collaboratively via mutual pseudo-labeling; finally, they improve their architectures by minimizing validation losses. We develop an efficient algorithm to solve the SGL problem. We apply our approach to neural architecture search and achieve significant improvements on CIFAR-100, CIFAR-10, and ImageNet.
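A rough sketch of the three-stage formulation described above, using notation introduced here purely for illustration (K learners with weights W_k, second-stage weights V_k, and architectures A_k; training, pseudo-labeled, and validation sets D^{tr}_k, \hat{D}_j, and D^{val}_k; tradeoff weight \lambda); the paper's own notation and exact objectives may differ:

Stage 1 (independent training): each learner fits its weights on its own labeled data,
    W_k^*(A_k) = \arg\min_{W_k} L^{tr}\big(W_k, A_k; D^{tr}_k\big), \quad k = 1, \dots, K.

Stage 2 (mutual pseudo-labeling): each learner re-trains on its own data plus the pseudo-labeled datasets \hat{D}_j produced by the other learners' stage-one models,
    V_k^*(A) = \arg\min_{V_k} L^{tr}\big(V_k, A_k; D^{tr}_k\big) + \lambda \sum_{j \neq k} L^{pl}\big(V_k, A_k; \hat{D}_j(W_j^*(A_j))\big).

Stage 3 (architecture search): the architectures are updated jointly to minimize the validation losses of the stage-two models,
    \min_{A_1, \dots, A_K} \sum_{k=1}^{K} L^{val}\big(V_k^*(A), A_k; D^{val}_k\big).

In a gradient-based instantiation (e.g., a DARTS-style continuous relaxation of the architectures), the lower-level problems would typically be approximated by one-step weight updates alternated with architecture updates rather than solved to optimality.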
