D-DARTS: Distributed Differentiable Architecture Search

Differentiable ARchiTecture Search (DARTS) is one of the most popular Neural Architecture Search (NAS) methods. It drastically reduces search cost by resorting to weight-sharing. However, weight-sharing also dramatically restricts the search space, thus excluding potentially promising architectures. In this article, we propose D-DARTS, a solution that addresses this problem by nesting neural networks at the cell level instead of using weight-sharing, which produces more diversified and specialized architectures. Moreover, we introduce a novel algorithm that can derive deeper architectures from a few trained cells, increasing performance and saving computation time. In addition, we present an alternative search space (DARTOpti) in which we optimize existing handcrafted architectures (e.g., ResNet) rather than starting from scratch. This approach is accompanied by a novel metric that measures the distance between architectures inside our custom search space. Our solution reaches competitive performance on multiple computer vision tasks. Code and pretrained models can be accessed at https://github.com/aheuillet/D-DARTS.
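
To make the core idea concrete, the sketch below contrasts the standard DARTS weight-sharing scheme, where a single set of architecture parameters ("alphas") is reused by every cell, with the per-cell parameters implied by the abstract's "nesting at the cell level". This is a minimal illustrative example, not the authors' implementation: the candidate operations, channel sizes, and class names are assumptions chosen for brevity.

```python
# Minimal sketch (assumed names/ops, not the D-DARTS codebase): each Cell owns its
# own architecture parameters, so cells can specialize independently instead of
# all sharing one alpha tensor as in vanilla DARTS.
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical candidate operations for a single edge of a cell.
CANDIDATE_OPS = {
    "skip_connect": lambda c: nn.Identity(),
    "sep_conv_3x3": lambda c: nn.Conv2d(c, c, 3, padding=1, groups=c, bias=False),
    "max_pool_3x3": lambda c: nn.MaxPool2d(3, stride=1, padding=1),
}

class MixedOp(nn.Module):
    """Weighted sum of candidate operations (continuous relaxation, as in DARTS)."""
    def __init__(self, channels: int):
        super().__init__()
        self.ops = nn.ModuleList(op(channels) for op in CANDIDATE_OPS.values())

    def forward(self, x: torch.Tensor, alpha_edge: torch.Tensor) -> torch.Tensor:
        weights = F.softmax(alpha_edge, dim=-1)  # one weight per candidate op
        return sum(w * op(x) for w, op in zip(weights, self.ops))

class Cell(nn.Module):
    """A cell that owns its architecture parameters instead of sharing them."""
    def __init__(self, channels: int, num_edges: int = 2):
        super().__init__()
        self.edges = nn.ModuleList(MixedOp(channels) for _ in range(num_edges))
        # Per-cell alphas: the key departure from weight-sharing across cells.
        self.alpha = nn.Parameter(1e-3 * torch.randn(num_edges, len(CANDIDATE_OPS)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for i, edge in enumerate(self.edges):
            x = edge(x, self.alpha[i])
        return x

class TinySupernet(nn.Module):
    """Stack of cells; each cell holds distinct alphas."""
    def __init__(self, channels: int = 8, num_cells: int = 3, num_classes: int = 10):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 3, padding=1, bias=False)
        self.cells = nn.ModuleList(Cell(channels) for _ in range(num_cells))
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.stem(x)
        for cell in self.cells:
            x = cell(x)
        return self.head(x.mean(dim=(2, 3)))  # global average pooling + classifier

if __name__ == "__main__":
    net = TinySupernet()
    logits = net(torch.randn(2, 3, 32, 32))
    print(logits.shape)  # torch.Size([2, 10])
```

Because every cell searches over its own alphas, the resulting architectures can differ from cell to cell, which is what the abstract means by more diversified and specialized cells; the trade-off is a larger number of architecture parameters to optimize.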
