In this demonstration we show a system for similarity search and automatic classification of images in a very large dataset. The demonstrated techniques use the MI-File (Metric Inverted File) as the access method for executing similarity search efficiently. The MI-File is an access method based on inverted files that relies on a space transformation, which uses the notion of perspective to decide about the similarity between two objects. More specifically, if two objects are close to each other, then the views of the space from their positions are also similar. Leveraging this space transformation, it is possible to use inverted files to execute approximate similarity search. To test the scalability of this access method, we indexed 106 million images from the CoPhIR dataset and created an online search engine that allows anyone to search this dataset. In addition, we used this access method to perform automatic classification on this very large image dataset. More specifically, we reformulated the classification problem, as it results from the use of an SVM with an RBF kernel, as a complex approximate similarity search problem. In this way, instead of comparing every single image against the classifier, the best images belonging to a class are obtained directly as the result of a complex approximate similarity search query.
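The perspective-based transformation can be illustrated with a minimal sketch. It is not the MI-File implementation: it assumes Euclidean points, a small fixed set of reference objects, and a rank-difference score in the style of a truncated Spearman footrule. All names (`perspective`, `build_index`, `search`) and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def perspective(obj, refs, k):
    """Indices of the k reference objects closest to obj, nearest first."""
    order = sorted(range(len(refs)), key=lambda i: math.dist(obj, refs[i]))
    return order[:k]


def build_index(objects, refs, k):
    """Inverted file: reference index -> list of (object id, rank of that reference)."""
    index = defaultdict(list)
    for oid, obj in enumerate(objects):
        for rank, ref in enumerate(perspective(obj, refs, k)):
            index[ref].append((oid, rank))
    return index


def search(query, index, refs, n_objects, k, top):
    """Approximate nearest neighbours, ranked by how similar the perspectives are."""
    # Every object starts with the maximum penalty: k for each of the
    # k reference objects in the query's perspective.
    scores = [k * k] * n_objects
    for q_rank, ref in enumerate(perspective(query, refs, k)):
        # Only the posting lists of the query's own references are read.
        for oid, o_rank in index[ref]:
            # The reference is shared: swap the penalty k for the rank difference.
            scores[oid] += abs(q_rank - o_rank) - k
    return sorted(range(n_objects), key=lambda oid: scores[oid])[:top]


# Toy data (assumed for illustration only).
refs = [(0, 0), (10, 0), (0, 10), (10, 10)]
objects = [(1, 2), (9, 2), (2, 9), (8, 9), (2, 1)]
k = 2

index = build_index(objects, refs, k)
result = search((1.5, 1), index, refs, len(objects), k, top=2)
print(result)  # [4, 0]
```

In this toy run the two objects returned are also the two true nearest neighbours of the query, but in general the result is approximate: objects are compared through their ordered lists of nearest reference objects, never through direct distance computations against the query.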