Semantics Discovery for Image Indexing

To bridge the gap between low-level features and high-level semantic queries in image retrieval, detecting meaningful visual entities (e.g., faces, sky, foliage, and buildings) with trained pattern classifiers has become an active research trend. However, a drawback of this supervised learning approach is the human effort needed to provide labeled regions as training samples. In this paper, we propose a new three-stage hybrid framework that discovers local semantic patterns and generates their training samples with minimal human intervention. First, support vector machines (SVMs) are trained on local image blocks taken from a small number of images, each labeled with one of several semantic categories. Second, to bootstrap the local semantics, image blocks that produce high SVM outputs are grouped into Discovered Semantic Regions (DSRs) using fuzzy c-means clustering. Third, training samples for these DSRs are automatically induced from the cluster memberships and used to train SVMs that serve as local semantic detectors for the DSRs. An image is then indexed as a tessellation of DSR histograms and matched using histogram intersection. We evaluate our method against the linear fusion of color and texture features using 16 semantic queries on 2400 heterogeneous consumer photos. The DSR models achieved a promising 26% improvement in average precision over the feature fusion approach.
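The matching step described above, in which an image is indexed as a tessellation of DSR histograms and compared by histogram intersection, can be illustrated with a minimal Python sketch. The abstract does not specify the tessellation layout, the normalization, or how tile scores are combined, so the per-tile normalization, the equal tile weights, and the function names `normalize`, `histogram_intersection`, and `image_similarity` are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence

Histogram = Sequence[float]


def normalize(hist: Histogram) -> List[float]:
    """Scale a non-negative histogram so that its bins sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive total mass")
    return [value / total for value in hist]


def histogram_intersection(a: Histogram, b: Histogram) -> float:
    """Sum of bin-wise minima; lies in [0, 1] for normalized histograms."""
    if len(a) != len(b):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(x, y) for x, y in zip(a, b))


def image_similarity(tiles_a: Sequence[Histogram],
                     tiles_b: Sequence[Histogram]) -> float:
    """Average tile-wise intersection of two tessellated images.

    Each image is a sequence of per-tile DSR histograms (one bin per DSR).
    Equal tile weights are an assumption of this sketch.
    """
    if len(tiles_a) != len(tiles_b) or not tiles_a:
        raise ValueError("images must share the same non-empty tessellation")
    scores = [
        histogram_intersection(normalize(p), normalize(q))
        for p, q in zip(tiles_a, tiles_b)
    ]
    return sum(scores) / len(scores)
```

As a usage example with made-up counts for three hypothetical DSRs over two tiles, `image_similarity([[3, 1, 0], [0, 2, 2]], [[1, 1, 2], [0, 1, 3]])` averages the tile scores 0.5 and 0.75, while comparing an image with itself gives 1.0.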
