Anchoring is a technique for representing objects by their distances to a few well-chosen anchors, or vantage points. In content-based image retrieval, it can be used to compute image similarity as a function of distances to a fixed set of representative images. Because the number of anchors is usually small, this reduces the dimensionality of similarity search, enables efficient indexing, and avoids potentially expensive similarity computations in the original feature domain, while guaranteeing no false dismissals. Anchoring is thus simple yet effective, and variants of it have been applied in speech recognition, audio classification, protein homology detection, and shape matching. In this paper, we describe the anchoring technique in some detail and study its properties from both an empirical and an analytical standpoint. In particular, we investigate the choice of baseline distance, the selection of anchors, and the number of anchors. We compare different approaches and evaluate the performance of different parameter settings. We also propose two new anchor selection heuristics that may overcome some of the drawbacks of the greedy selection methods currently in use.
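To make the idea concrete, the following is a minimal sketch of anchor-based range search, not the paper's implementation. It assumes that the baseline distance is a metric (Euclidean distance here) and that the anchor-space comparison is the maximum coordinate difference; under those assumptions the triangle inequality gives |d(q, a) - d(x, a)| <= d(q, x), so discarding a candidate whose anchor-space bound exceeds the search radius cannot cause a false dismissal. The names `anchor_embed`, `lower_bound`, and `range_search`, the random data, and the random choice of anchors are all illustrative assumptions.

```python
import math
import random


def euclidean(a, b):
    """Baseline distance in the original feature domain (assumed to be a metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def anchor_embed(obj, anchors, dist=euclidean):
    """Represent an object by its distances to each anchor."""
    return [dist(obj, a) for a in anchors]


def lower_bound(emb_q, emb_x):
    """Largest coordinate difference in anchor space.

    For a metric baseline distance, this never exceeds the true distance
    between the two original objects (triangle inequality).
    """
    return max(abs(u - v) for u, v in zip(emb_q, emb_x))


def range_search(query, database, embeddings, anchors, radius, dist=euclidean):
    """Return indices of objects within `radius` of `query`.

    Also returns how many exact distance computations were needed
    after filtering in anchor space.
    """
    emb_q = anchor_embed(query, anchors, dist)
    results = []
    n_exact = 0
    for i, (x, emb_x) in enumerate(zip(database, embeddings)):
        if lower_bound(emb_q, emb_x) > radius:
            continue  # safely discarded: the true distance is at least the bound
        n_exact += 1
        if dist(query, x) <= radius:
            results.append(i)
    return results, n_exact


# Illustrative data: 500 random 16-dimensional vectors, 4 randomly chosen anchors.
random.seed(0)
database = [[random.random() for _ in range(16)] for _ in range(500)]
anchors = random.sample(database, 4)
embeddings = [anchor_embed(x, anchors) for x in database]  # computed once, offline

query = [random.random() for _ in range(16)]
hits, n_exact = range_search(query, database, embeddings, anchors, radius=1.0)

# Brute-force reference for comparison: same result set, no filtering.
brute = [i for i, x in enumerate(database) if euclidean(query, x) <= 1.0]
```

In this sketch the filtered search returns the same indices as the brute-force scan, while `n_exact` counts the candidates that survived the anchor-space filter. How much filtering is achieved depends on the data, the baseline distance, and which anchors are chosen and how many, which are the parameters the paper sets out to study.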