Novel Dataset for Fine-Grained Image Categorization : Stanford Dogs

We introduce a 120 class Stanford Dogs dataset, a challenging and large-scale dataset aimed at fine-grained image categorization. Stanford Dogs includes over 22,000 annotated images of dogs belonging to 120 species. Each image is annotated with a bounding box and object class label. Fig. 1 shows examples of images from Stanford Dogs. This dataset is extremely challenging due to a variety of reasons. First, being a fine-grained categorization problem, there is little inter-class variation. For example the basset hound and bloodhound share very similar facial characteristics but differ significantly in their color, while the Japanese spaniel and papillion share very similar color but greatly differ in their facial characteristics. Second, there is very large intra-class variation. The images show that dogs within a class could have different ages (e.g. beagle), poses (e.g. blenheim spaniel), occlusion/self-occlusion and even color (e.g. Shih-tzu). Furthermore, compared to other animal datasets that tend to exist in natural scenes, a large proportion of the images contain humans and are taken in manmade environments leading to greater background variation. The aforementioned reasons make this an extremely challenging dataset.

[1]  Pietro Perona,et al.  Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[2]  Jianguo Zhang,et al.  The PASCAL Visual Object Classes Challenge , 2006 .

[3]  Luc Van Gool,et al.  The 2005 PASCAL Visual Object Classes Challenge , 2005, MLCW.

[4]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Fei-Fei Li,et al.  Modeling mutual context of object and human pose in human-object interaction activities , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[6]  Pietro Perona,et al.  Caltech-UCSD Birds 200 , 2010 .

[7]  Fei-Fei Li,et al.  Combining randomization and discrimination for fine-grained image categorization , 2011, CVPR 2011.