Fine-grained bird recognition by using contour-based pose transfer

Abstract. We propose a pose transfer method for fine-grained classification of birds, whose appearance varies widely across poses and subcategories. Bird pose is transferred using a Radon-transform-based contour descriptor, k-means clustering, and a k-nearest neighbors (KNN) classifier. During training, we cluster annotated image samples into poses based on their normalized part locations and use each cluster center as the consistent part constellation for that pose. At test time, the Radon-transform-based contour descriptor and a KNN classifier with cosine similarity determine the pose of a sample, and the normalized part constellation of that pose is transferred to the unannotated image. Bag-of-visual-words features with OpponentSIFT and color names, extracted from each part and from the global image, are concatenated into a feature vector that is input to a support vector machine for classification. Experimental results on the Caltech-UCSD Birds-2011 dataset demonstrate significant performance gains from our method on the fine-grained bird classification task.
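
The pose transfer step can be illustrated with a minimal sketch. It is not the implementation evaluated here: the part layouts, the three-dimensional stand-ins for the contour descriptors, the bounding box, and all function names (`kmeans`, `knn_pose`, `transfer_parts`) are hypothetical, the Radon-transform descriptor itself is assumed to be computed elsewhere, and the farthest-first initialization of k-means is a simplification chosen to keep the example deterministic.

```python
import math

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-first seeding (an assumption)."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next seed: the sample farthest from every center chosen so far.
        centers.append(list(max(points, key=lambda p: min(sq_dist(p, c) for c in centers))))
    labels = []
    for _ in range(iters):
        # Assign each sample to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def knn_pose(query, train_descs, train_poses, k=3):
    """Majority pose among the k training descriptors most cosine-similar to the query."""
    ranked = sorted(range(len(train_descs)),
                    key=lambda i: cosine(query, train_descs[i]), reverse=True)
    votes = [train_poses[i] for i in ranked[:k]]
    return max(set(votes), key=votes.count)

def transfer_parts(constellation, box):
    """Map a normalized constellation (x1, y1, x2, y2, ...) into a box (x, y, w, h)."""
    x, y, w, h = box
    return [(x + constellation[i] * w, y + constellation[i + 1] * h)
            for i in range(0, len(constellation), 2)]

# Toy training set: normalized (head_x, head_y, tail_x, tail_y) per annotated image.
train_parts = [
    [0.20, 0.30, 0.80, 0.70], [0.22, 0.28, 0.78, 0.72], [0.18, 0.32, 0.82, 0.68],
    [0.80, 0.30, 0.20, 0.70], [0.78, 0.28, 0.22, 0.72], [0.82, 0.32, 0.18, 0.68],
]
# Toy stand-ins for the contour descriptors of the same six images.
train_descs = [
    [1.0, 0.1, 0.0], [0.9, 0.2, 0.1], [1.0, 0.0, 0.1],
    [0.0, 0.1, 1.0], [0.1, 0.2, 0.9], [0.1, 0.0, 1.0],
]

# Training: cluster part layouts into poses; each center is a pose constellation.
centers, labels = kmeans(train_parts, k=2)

# Testing: find the pose of an unannotated sample, then transfer the constellation.
query_desc = [0.95, 0.1, 0.05]
pose = knn_pose(query_desc, train_descs, labels, k=3)
parts = transfer_parts(centers[pose], box=(10, 20, 100, 50))
print(pose, parts)
```

In this sketch the transferred part locations come from the cluster center alone, so every test image assigned to the same pose receives the same normalized constellation, rescaled to its own bounding box.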
