Classification and Automatic Annotation Extension of Images Using Bayesian Network

In many vision problems, it is easier to obtain training data in which only a subset of the images is annotated, since full annotation is more demanding for the user. We therefore consider the problem of classifying weakly-annotated images, where only a small subset of the database is annotated with keywords. We present and evaluate a new method that improves the effectiveness of content-based image classification by integrating semantic concepts extracted from text and by automatically extending annotations to the images with missing keywords. Our model is inspired by probabilistic graphical model theory: we propose a hierarchical mixture model that can handle missing values. Results of visual-textual classification, reported on a partially and manually annotated database of images collected from the Web, show a 32.3% improvement in recognition rate over classification using visual information alone. Moreover, automatic annotation extension with our model for images with missing keywords outperforms visual-textual classification by 6.8%. Finally, the proposed method is experimentally competitive with state-of-the-art classifiers.
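The core idea of classifying weakly-annotated images can be illustrated with a toy Bayesian network in which a class node generates both a visual feature and a keyword. This is only a minimal sketch, not the paper's actual model: all conditional probability values below are hypothetical, and the point is simply that when the keyword is missing it can be marginalized out, so the same classifier handles annotated and unannotated images.

```python
# Toy Bayesian network: class C -> visual feature V, class C -> keyword K.
# All CPT values are hypothetical, chosen only to illustrate inference.
P_C = [0.5, 0.5]                      # prior P(C) over 2 classes
P_V_GIVEN_C = [[0.7, 0.2, 0.1],       # P(V | C=0) over 3 visual bins
               [0.1, 0.3, 0.6]]       # P(V | C=1)
P_K_GIVEN_C = [[0.9, 0.1],            # P(K | C=0) over 2 keywords
               [0.2, 0.8]]            # P(K | C=1)

def posterior(v, k=None):
    """Return P(C | V=v, K=k). If k is None (missing annotation),
    the keyword node is marginalized out, leaving P(C | V=v)."""
    joint = [P_C[c] * P_V_GIVEN_C[c][v] for c in range(len(P_C))]
    if k is not None:
        joint = [joint[c] * P_K_GIVEN_C[c][k] for c in range(len(P_C))]
    z = sum(joint)
    return [p / z for p in joint]
```

With these hypothetical tables, observing a keyword that agrees with the visual evidence sharpens the posterior, which mirrors the paper's finding that textual concepts improve purely visual classification.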
