Integrating Multiple Classifiers In Visual Object Detectors Learned From User Input

Many recent efforts in content-based retrieval have aimed at automatic classification of images and visual objects. Most approaches, however, have relied on individual classifiers. In this paper, we study how multiple classifiers can be combined, within a dynamic framework, when applying Visual Object Detectors. We propose a hybrid classifier combination approach in which the decisions of individual classifiers are combined in three ways: (1) classifier fusion, (2) classifier cooperation, and (3) hierarchical combination. In earlier work, we presented the Visual Apprentice framework, in which a user defines visual object models via a multiple-level object-definition hierarchy (region, perceptual-area, object part, and object). As the user provides examples from images or videos, visual features are extracted and multiple classifiers are learned for each node of the hierarchy. Here we discuss the benefits of hybrid classifier combination in the Visual Apprentice framework and present experimental results on classifier fusion. These results suggest possible improvements in classification accuracy, particularly for the detectors reported earlier for baseball video, images with skies, and images with handshakes.
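To make the idea of classifier fusion concrete, the sketch below combines the outputs of several classifiers for a single node of the hierarchy using two common generic rules: majority voting over predicted labels, and a sum rule over per-class scores. This is an illustration only, not the fusion procedure used in the Visual Apprentice; the function names, the labels `sky` and `not_sky`, and the score values are all hypothetical.

```python
from collections import Counter
from typing import Dict, List


def majority_vote(labels: List[str]) -> str:
    """Return the label chosen by the most classifiers (ties go to the first seen)."""
    return Counter(labels).most_common(1)[0][0]


def sum_rule(scores_per_classifier: List[Dict[str, float]]) -> str:
    """Add up each class's score across classifiers and return the top class."""
    totals: Dict[str, float] = {}
    for scores in scores_per_classifier:
        for label, score in scores.items():
            totals[label] = totals.get(label, 0.0) + score
    return max(totals, key=lambda label: totals[label])


# Hypothetical per-class scores from three classifiers for one image region.
scores = [
    {"sky": 0.55, "not_sky": 0.45},
    {"sky": 0.55, "not_sky": 0.45},
    {"sky": 0.10, "not_sky": 0.90},
]

# Each classifier's hard decision is its highest-scoring class.
hard_labels = [max(s, key=lambda label: s[label]) for s in scores]

vote_decision = majority_vote(hard_labels)
sum_decision = sum_rule(scores)
```

With these made-up scores the two rules disagree: two weakly confident classifiers outvote the third under majority voting, while the sum rule lets the third classifier's strong confidence decide. Which rule is preferable depends on how reliable the individual confidence estimates are.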
