Representative Batch Normalization with Feature Calibration

Batch Normalization (BatchNorm) has become the default component in modern neural networks for stabilizing training. BatchNorm standardizes features over the batch dimension through centering and scaling operations that use mean and variance statistics. This batch dependency enables stable training and better network representations, but it inevitably ignores the representation differences among instances. We propose adding a simple yet effective feature calibration scheme to the centering and scaling operations of BatchNorm, enhancing instance-specific representations at negligible computational cost. The centering calibration strengthens informative features and reduces noisy features. The scaling calibration restricts the feature intensity to form a more stable feature distribution. Our proposed variant of BatchNorm, namely Representative BatchNorm, can be plugged into existing methods to boost performance on various tasks such as classification, detection, and segmentation. The source code is available at http://mmcheng.net/rbn.
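
As a rough illustration of where such calibration could sit, the NumPy sketch below first implements standard training-mode BatchNorm (centering, then scaling, over the batch and spatial dimensions) and then a calibrated variant. The abstract does not state the calibration formulas, so the specific forms used here are assumptions: a per-channel weight `w_m` times each instance's spatial mean is added before centering, and a sigmoid gate controlled by `w_v` and `w_b` is applied after scaling. All function and parameter names are hypothetical and are not taken from the released code.

```python
import numpy as np


def batch_norm(x, gamma, beta, eps=1e-5):
    """Standard training-mode BatchNorm for an (N, C, H, W) array."""
    # Per-channel statistics over the batch and spatial dimensions.
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_centered = x - mean                       # centering
    x_scaled = x_centered / np.sqrt(var + eps)  # scaling
    return gamma * x_scaled + beta              # learnable affine transform


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def calibrated_batch_norm(x, gamma, beta, w_m, w_v, w_b, eps=1e-5):
    """BatchNorm with ASSUMED centering and scaling calibration.

    w_m, w_v, w_b are hypothetical per-channel parameters of shape
    (1, C, 1, 1); the formulas are illustrative assumptions.
    """
    # Assumed centering calibration: shift each instance by a weighted
    # version of its own per-channel spatial mean before centering.
    k_m = x.mean(axis=(2, 3), keepdims=True)    # instance statistic, (N, C, 1, 1)
    x = x + w_m * k_m

    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_scaled = (x - mean) / np.sqrt(var + eps)

    # Assumed scaling calibration: a sigmoid gate in (0, 1) computed from
    # an instance statistic, which can only shrink the feature magnitude.
    k_s = x_scaled.mean(axis=(2, 3), keepdims=True)
    x_scaled = x_scaled * sigmoid(w_v * k_s + w_b)

    return gamma * x_scaled + beta


rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=(8, 4, 5, 5))
ones = np.ones((1, 4, 1, 1))
zeros = np.zeros((1, 4, 1, 1))

y_plain = batch_norm(x, ones, zeros)
y_calib = calibrated_batch_norm(x, ones, zeros, w_m=0.1 * ones, w_v=ones, w_b=zeros)
```

In this sketch, setting `w_m = 0` and `w_v = 0` with a large positive `w_b` drives the gate toward 1 and recovers plain BatchNorm up to floating-point error, so the calibration adds only a few per-channel parameters and one spatial average per instance.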
