Learning to Learn Cropping Models for Different Aspect Ratio Requirements

Image cropping aims to improve the framing of an image by removing its extraneous outer areas, and it is widely used in the photography and printing industries. In some cases, the aspect ratio of the cropping result is specified in advance by external conditions. In this paper, we propose Mars, a meta-learning (learning to learn) based method for aspect-ratio-specified image cropping, which can generate cropping results for different expected aspect ratios. The training stage yields a base model and two meta-learners. Given an aspect ratio at test time, the two meta-learners predict the parameters of the base model from that aspect ratio, so a new model with new parameters is generated from the base model. The method thus learns how to learn cropping models for different aspect ratio requirements, which is a typical meta-learning process. In the experiments, the proposed method is evaluated on three datasets and outperforms most state-of-the-art methods in terms of accuracy and speed. In addition, both the intermediate and final results show that the proposed model can predict different cropping windows for the same image depending on the required aspect ratio.
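The idea of generating model parameters from an aspect ratio can be illustrated with a small sketch. This is not the Mars architecture, which the abstract does not detail. It is a toy illustration under several assumptions: the base model is reduced to a single linear layer that maps a feature vector to four crop-box values, one hypothetical meta-learner predicts that layer's weights and the other predicts its bias, and the aspect ratio embedding (`embed_aspect_ratio`) is invented for the example. All names, sizes, and the random placeholder parameters are hypothetical.

```python
import math
import random

# Hypothetical sizes, chosen only for illustration.
FEAT_DIM = 8    # length of the image feature vector
EMBED_DIM = 4   # length of the aspect ratio embedding
BOX_DIM = 4     # crop box values, e.g. (x1, y1, x2, y2)

rng = random.Random(0)


def random_matrix(rows, cols, scale=0.1):
    """Placeholder for parameters that training would normally produce."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Two meta-learners (assumed linear here): one predicts the weights of the
# base layer, the other predicts its bias.
META_WEIGHT = random_matrix(BOX_DIM * FEAT_DIM, EMBED_DIM)
META_BIAS = random_matrix(BOX_DIM, EMBED_DIM)


def embed_aspect_ratio(ratio):
    """Assumed embedding of the width/height ratio; log makes 2:1 and 1:2 symmetric."""
    log_r = math.log(ratio)
    return [1.0, log_r, log_r * log_r, math.tanh(log_r)]


def generate_model(ratio):
    """Generate the base layer's parameters for one aspect ratio."""
    embedding = embed_aspect_ratio(ratio)
    flat = matvec(META_WEIGHT, embedding)
    weight = [flat[i * FEAT_DIM:(i + 1) * FEAT_DIM] for i in range(BOX_DIM)]
    bias = matvec(META_BIAS, embedding)
    return weight, bias


def predict_crop(features, ratio):
    """Apply the generated layer to image features to get a crop box."""
    weight, bias = generate_model(ratio)
    return [out + b for out, b in zip(matvec(weight, features), bias)]


if __name__ == "__main__":
    features = [1.0] * FEAT_DIM  # stand-in for features from an image backbone
    print(predict_crop(features, 1.0))
    print(predict_crop(features, 16 / 9))
```

The same feature vector yields different outputs for different ratios because the layer's parameters are regenerated for each requested ratio, rather than training one model per ratio.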
