A Scaling Law for Synthetic-to-Real Transfer: A Measure of Pre-Training

Synthetic-to-real transfer learning is a framework in which models are pre-trained on synthetically generated images, whose ground-truth annotations come essentially for free, and then fine-tuned on real tasks. Although synthetic images overcome the data-scarcity issue, it remains unclear how fine-tuning performance scales with the pre-trained model, especially with respect to pre-training data size. In this study, we collect a large set of empirical observations and characterize this scaling behavior. Through experiments, we observe a simple and general scaling law that consistently describes the learning curves across various tasks, model architectures, and complexities of the synthesized pre-training data. Furthermore, we develop a theory of transfer learning for a simplified scenario and confirm that the derived generalization bound is consistent with our empirical findings.
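
To make the idea of fitting such a scaling law concrete, the sketch below fits a power-law-plus-offset curve, err(n) = a * n^(-alpha) + c, to hypothetical (pre-training size, fine-tuning error) pairs. The functional form, the data points, and the scipy-based fitting routine are all illustrative assumptions here, not the paper's actual procedure or measurements.

```python
# Minimal sketch (not the authors' code): fit an assumed scaling law
#     err(n) = a * n**(-alpha) + c
# where n is the pre-training data size, a is a scale factor, alpha is the
# decay rate, and c is an irreducible error floor.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(n, a, alpha, c):
    # Power-law decay in pre-training size plus a constant floor.
    return a * n ** (-alpha) + c

# Hypothetical measurements: pre-training set sizes and fine-tuned test errors.
n = np.array([1e3, 3e3, 1e4, 3e4, 1e5, 3e5])
err = np.array([0.42, 0.35, 0.29, 0.25, 0.22, 0.21])

params, _ = curve_fit(scaling_law, n, err, p0=(1.0, 0.3, 0.1), maxfev=10000)
a, alpha, c = params
print(f"fit: err(n) = {a:.2f} * n^(-{alpha:.2f}) + {c:.2f}")
```

As a sanity check for fits of this shape, subtracting the fitted floor c and plotting the remainder on log-log axes should yield an approximately straight line with slope -alpha.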
