Mining image features for efficient query processing

The number of features required to depict an image can be very large. Using all features simultaneously to measure image similarity and to learn image query concepts suffers from the curse of dimensionality, which degrades both search accuracy and search speed. Regarding search accuracy, features that are irrelevant to a query can contaminate the similarity measurement and hence decrease both the recall and the precision of that query. To remedy this problem, we present a mining method that learns users' query concepts online and quickly identifies the important features. Regarding search speed, a large number of features can slow down both query-concept learning and indexing. We propose a divide-and-conquer method that divides the concept-learning task into G subtasks to achieve speedup. The task must be divided carefully, however, or search accuracy may suffer. We thus propose a genetic-based mining algorithm to discover good feature groupings. Through analysis and mining results, we observe that organizing image features in a multi-resolution manner and minimizing intra-group feature correlation can speed up query-concept learning substantially while maintaining high search accuracy.
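As a rough illustration of the intra-group correlation criterion mentioned above, the following minimal sketch scores a candidate feature grouping by the mean absolute Pearson correlation between features placed in the same group. The abstract does not specify the fitness function used by the genetic-based mining algorithm, so the scoring rule, the function names (`pearson`, `intra_group_correlation`), and the synthetic data are all assumptions made for illustration only.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def intra_group_correlation(columns, groups):
    """Mean absolute correlation over all feature pairs that share a group.

    columns: list of feature columns (one list of values per feature).
    groups:  list of groups, each a list of feature indices.
    Hypothetical scoring rule; lower means less redundancy within groups.
    """
    total, pairs = 0.0, 0
    for group in groups:
        for a in range(len(group)):
            for b in range(a + 1, len(group)):
                total += abs(pearson(columns[group[a]], columns[group[b]]))
                pairs += 1
    return total / pairs if pairs else 0.0


# Synthetic data (assumption): features 0 and 1 are near-duplicates,
# as are features 2 and 3; the two pairs are independent of each other.
rng = random.Random(0)
base_a = [rng.gauss(0, 1) for _ in range(500)]
base_b = [rng.gauss(0, 1) for _ in range(500)]
columns = [
    base_a,
    [v + 0.1 * rng.gauss(0, 1) for v in base_a],
    base_b,
    [v + 0.1 * rng.gauss(0, 1) for v in base_b],
]

# Grouping that keeps correlated features together vs. one that separates them.
correlated = intra_group_correlation(columns, [[0, 1], [2, 3]])
decorrelated = intra_group_correlation(columns, [[0, 2], [1, 3]])
print(f"correlated grouping:   {correlated:.3f}")
print(f"decorrelated grouping: {decorrelated:.3f}")
```

Under this assumed criterion, a search procedure such as a genetic algorithm would prefer the second grouping, since each of its G = 2 groups holds features that carry largely independent information.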