A Comparison of Human and Machine Assessments of Image Similarity for the Organization of Image Databases

There has recently been significant interest in the organization and content-based querying of large image databases. Most frequently, the underlying hypothesis is that image similarity can be characterized by low-level image features, without further abstraction. This assumes that machine and human measures of image similarity agree well enough for the database to be useful. We wish to assess the validity of this assumption. To this end, we develop measures of the agreement between two partitionings of an image set, and we show that it is vital to take chance agreements into account. We then use these measures to assess the agreement between human subjects and a variety of machine clustering techniques on a set of images. The results can be used to select and refine image distance measures for querying and organizing image databases.
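To illustrate why chance agreement matters when comparing two partitionings, the following is a minimal sketch, not the measure developed in the paper, which the abstract does not specify. It assumes a common pair-counting approach: each pair of images is judged "same group" or "different group" under each partitioning, and a Cohen's-kappa-style correction is applied to the raw fraction of pairs on which the two partitionings agree. The function name `pair_agreement` and the toy labels are hypothetical.

```python
from itertools import combinations


def pair_agreement(labels_a, labels_b):
    """Compare two partitionings of the same items via pair co-membership.

    Returns (observed, expected, kappa), where `observed` is the raw fraction
    of item pairs on which the partitionings agree, `expected` is the agreement
    expected by chance from each partitioning's own rate of "same group"
    decisions, and `kappa` is the chance-corrected agreement.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitionings must label the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    n = len(pairs)

    # For every pair, does each partitioning put the two items together?
    same_a = [labels_a[i] == labels_a[j] for i, j in pairs]
    same_b = [labels_b[i] == labels_b[j] for i, j in pairs]

    observed = sum(x == y for x, y in zip(same_a, same_b)) / n

    # Chance agreement, assuming the two sets of pair decisions are independent.
    p_a = sum(same_a) / n
    p_b = sum(same_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)

    # Convention chosen for this sketch: if chance agreement is already total,
    # report kappa = 1.0 rather than dividing by zero.
    kappa = (observed - expected) / (1 - expected) if expected < 1 else 1.0
    return observed, expected, kappa


# Toy example: a "human" partitioning that puts every image in its own group
# versus a "machine" partitioning that merges just two images.
human = [0, 1, 2, 3, 4, 5]
machine = [0, 0, 1, 2, 3, 4]
observed, expected, kappa = pair_agreement(human, machine)
print(f"raw agreement = {observed:.3f}, chance = {expected:.3f}, kappa = {kappa:.3f}")
```

In this toy case the raw agreement is high only because almost all pairs are separated under both partitionings, and the corrected score drops to zero. That is the kind of effect a chance correction is meant to expose.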
