Ontological Random Forests for Image Classification

Previous image classification approaches mostly neglect semantics, which leads to two major limitations. First, categories are treated as independent even though they overlap semantically. For example, "sedan" is a specific kind of "car", so it is unreasonable to train a classifier to distinguish between "sedan" and "car". Second, the same image feature representation is used for classifying all categories, whereas the human perception system is believed to use different features for different objects. In this paper, we leverage semantic ontologies to address both problems. We propose an ontological random forest algorithm in which the splitting of decision trees is determined by semantic relations among categories. Hierarchical features are then learned automatically by multiple-instance learning to capture visual dissimilarities at different concept levels. We test our approach on two image classification datasets. Experimental results demonstrate that it not only outperforms state-of-the-art results but also identifies semantic visual features.
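
To make the idea of an ontology-driven split concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a toy ontology stored as a parent-to-children dictionary and shows only how a tree node could partition category labels along semantic lines instead of by a feature threshold. The names `ONTOLOGY`, `leaves_under`, and `semantic_split` are hypothetical, and the sketch omits classifier training and the multiple-instance feature learning entirely.

```python
from typing import Dict, List

# Hypothetical toy ontology (parent -> children); an assumption for
# illustration, not taken from the paper or from any real ontology.
ONTOLOGY: Dict[str, List[str]] = {
    "entity": ["vehicle", "animal"],
    "vehicle": ["car", "bicycle"],
    "car": ["sedan", "suv"],
    "animal": ["dog", "cat"],
}


def leaves_under(node: str) -> List[str]:
    """Return all leaf categories in the subtree rooted at `node`."""
    children = ONTOLOGY.get(node, [])
    if not children:
        return [node]  # a node without children is itself a leaf category
    leaves: List[str] = []
    for child in children:
        leaves.extend(leaves_under(child))
    return leaves


def semantic_split(node: str, labels: List[str]) -> Dict[str, List[str]]:
    """Group leaf labels by the child of `node` they descend from.

    Each resulting group would correspond to one branch of a decision
    tree node, so "sedan" is never pitted against its ancestor "car".
    """
    groups: Dict[str, List[str]] = {}
    for child in ONTOLOGY.get(node, []):
        subtree = set(leaves_under(child))
        members = [label for label in labels if label in subtree]
        if members:
            groups[child] = members
    return groups


if __name__ == "__main__":
    print(semantic_split("entity", ["sedan", "dog", "bicycle"]))
    print(semantic_split("car", ["sedan", "suv"]))
```

In this sketch, a split at "entity" separates vehicles from animals, and finer distinctions such as "sedan" versus "suv" are deferred to the deeper "car" node, which is where level-specific features would be applied.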
