Fast-AT: Fast Automatic Thumbnail Generation Using Deep Neural Networks

Fast-AT is an automatic thumbnail generation system built on a fully convolutional deep neural network that learns separate filters for thumbnails of different sizes and aspect ratios. During inference, the appropriate filter is selected according to the dimensions of the target thumbnail. Unlike most previous work, Fast-AT does not rely on saliency but addresses the problem directly, which also eliminates the need for a region search over a saliency map. The model generalizes to thumbnails of different sizes, including those with extreme aspect ratios, and can generate thumbnails in real time. A dataset of more than 70,000 thumbnail annotations was collected to train Fast-AT. We show competitive results in comparison to existing techniques.
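
The inference-time filter selection described above can be illustrated with a minimal sketch. This is not the authors' implementation: the aspect-ratio values in `FILTER_ASPECT_RATIOS`, the function name `select_filter`, and the nearest-ratio rule in log space are all assumptions made for illustration, since the abstract does not state how the filters are indexed or matched.

```python
import math

# Hypothetical aspect ratios (width / height), one per learned filter.
# The actual set of ratios is not given in the abstract.
FILTER_ASPECT_RATIOS = [0.5, 1.0, 2.0]


def select_filter(target_width, target_height, ratios=FILTER_ASPECT_RATIOS):
    """Return the index of the filter whose aspect ratio is closest to the target's.

    Distance is measured in log space so that 2:1 and 1:2 are treated
    as equally far from 1:1 (an assumption of this sketch).
    """
    if target_width <= 0 or target_height <= 0:
        raise ValueError("thumbnail dimensions must be positive")
    target = math.log(target_width / target_height)
    return min(
        range(len(ratios)),
        key=lambda i: abs(math.log(ratios[i]) - target),
    )
```

Under these assumed ratios, a square 100x100 target would map to the middle filter, while a wide 320x160 target would map to the last one.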
