Cross-modal Retrieval by Text and Image Feature Biclustering

We describe our approach to the ImageCLEF-Photo 2007 task. The novelty of our method is the biclustering of image segments and annotation words. Given the query words, we select the image segment clusters that have the strongest co-occurrence with the corresponding word clusters. These image segment clusters serve as the segments relevant to the query. We rank text hits with our own tf.idf-based information retrieval system, and we compute image similarity from a 20-dimensional vector describing the visual content of each image segment, using only the relevant segments selected by the biclustering procedure. Images were segmented by a segmenter developed in-house. We used neither query expansion nor relevance feedback; queries were generated automatically from the title words and the description words, the latter weighted by 0.1.
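
The selection step described above can be illustrated with a minimal sketch. It assumes that the biclustering has already produced a word-to-cluster map and a table of co-occurrence strengths between word clusters and segment clusters. The function names, the data layout, and the way query weights are combined with co-occurrence counts are hypothetical choices made for illustration; only the 0.1 description weight comes from the text.

```python
from collections import defaultdict


def build_query_weights(title, description, description_weight=0.1):
    """Weight title words by 1.0 and description words by 0.1 (assumed additive)."""
    weights = defaultdict(float)
    for word in title.lower().split():
        weights[word] += 1.0
    for word in description.lower().split():
        weights[word] += description_weight
    return dict(weights)


def select_segment_clusters(query_weights, word_cluster, cooccurrence, top_k=1):
    """Return the segment clusters that co-occur most strongly with the query.

    query_weights: word -> weight in the query
    word_cluster:  word -> word cluster id (hypothetical biclustering output)
    cooccurrence:  word cluster id -> {segment cluster id: strength}
    """
    scores = defaultdict(float)
    for word, weight in query_weights.items():
        cluster = word_cluster.get(word)
        if cluster is None:
            continue  # word not seen among the annotation words
        for segment_cluster, strength in cooccurrence.get(cluster, {}).items():
            scores[segment_cluster] += weight * strength
    # Strongest first; ties broken by cluster id for determinism.
    ranked = sorted(scores, key=lambda sc: (-scores[sc], sc))
    return ranked[:top_k]


# Toy data, invented purely for illustration.
word_cluster = {"beach": 0, "sea": 0, "church": 1}
cooccurrence = {0: {10: 5.0, 11: 1.0}, 1: {11: 4.0}}
weights = build_query_weights("beach", "church near sea")
selected = select_segment_clusters(weights, word_cluster, cooccurrence, top_k=1)
```

Only the segments that belong to the selected clusters would then contribute their 20-dimensional feature vectors to the image similarity score.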
