Paper Information - Sketch-Specific Data Augmentation for Freehand Sketch Recognition

Sketch-Specific Data Augmentation for Freehand Sketch Recognition

Sketch recognition remains a significant challenge because training data is limited and freehand sketches of the same object vary substantially within a class. Conventional methods for this task often rely on the temporal order of sketch strokes, additional cues from other modalities, or supervised augmentation of sketch datasets with real images, all of which limit their applicability and feasibility in real scenarios. In this paper, we propose a novel sketch-specific data augmentation (SSDA) method that automatically improves both the quantity and the quality of the sketches. To increase quantity, we introduce a Bezier pivot based deformation (BPD) strategy that enriches the training data. To improve quality, we present a mean stroke reconstruction (MSR) approach that generates a set of novel types of sketches with smaller intra-class variances. Neither solution depends on multi-source data or on temporal cues of sketches. Furthermore, we show that some recent deep convolutional neural network models trained on generic classes of real images can be better choices than most of the elaborate architectures designed explicitly for sketch recognition. Because SSDA can be integrated with any convolutional neural network, it has a distinct advantage over existing methods. Our extensive experimental evaluations demonstrate that the proposed method achieves state-of-the-art results (84.27%) on the TU-Berlin dataset, exceeding human performance by 11.17 percentage points. We also present a new benchmark named Sketchy-R to facilitate future research in sketch recognition. Finally, further experiments show the practical value of our approach for the task of sketch-based image retrieval.
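
The abstract does not spell out how BPD works, so the following is only a rough illustration of the general idea of deforming a stroke through its Bezier control points, not the paper's algorithm. It assumes a stroke has already been fitted with a single cubic Bezier curve, and the names `cubic_bezier`, `deform_stroke`, and `max_shift` are hypothetical choices made for this sketch.

```python
import random


def cubic_bezier(p0, p1, p2, p3, n=20):
    """Sample n points on the cubic Bezier curve with control points p0..p3."""
    points = []
    for i in range(n):
        t = i / (n - 1)
        s = 1.0 - t
        # Bernstein basis weights for a cubic curve.
        b0, b1, b2, b3 = s ** 3, 3 * s * s * t, 3 * s * t * t, t ** 3
        x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
        y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
        points.append((x, y))
    return points


def deform_stroke(control_points, max_shift=3.0, rng=None):
    """Return new control points with the two interior ones randomly shifted.

    Assumption for this sketch: endpoints stay fixed so the stroke still
    connects to its neighbours; only the interior control points move.
    """
    rng = rng or random.Random()
    p0, p1, p2, p3 = control_points

    def jitter(p):
        return (p[0] + rng.uniform(-max_shift, max_shift),
                p[1] + rng.uniform(-max_shift, max_shift))

    return [p0, jitter(p1), jitter(p2), p3]


# One hypothetical stroke, deformed five times to produce five variants.
stroke = [(0.0, 0.0), (10.0, 20.0), (30.0, 20.0), (40.0, 0.0)]
rng = random.Random(0)
variants = [cubic_bezier(*deform_stroke(stroke, rng=rng)) for _ in range(5)]
```

Each entry of `variants` is a polyline that can be rasterized into a new training image. Keeping the endpoints fixed is a design choice of this sketch that preserves the overall layout of the drawing while varying the curvature of individual strokes.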
