Accuracy measurement for image retrieval system

Over the past decades, the volume of image data has grown steadily, leading to huge repositories. Content-based image retrieval (CBIR) methods aim to improve access to image data. To date, numerous feature extraction methods have been proposed to improve the quality of CBIR and image classification systems. In this paper, we analyze the technique of relevance feedback for image retrieval systems. The survey studies the methods used in image retrieval systems. Structure-based features apply to tasks as broad as texture image retrieval and/or classification. To develop structure-based feature extraction, we have to investigate CBIR and classification problems. Digital image libraries are becoming more common and widely used as visual information is produced at a rapidly growing rate. Creating and storing digital images is nowadays easy and increasingly affordable. As a result, the amount of data in visual form is increasing and there is a strong need for effective ways to manage and process it. We have studied support vector machines to learn the feature space distribution of our structure-based features for several image classes. CBIR comprises three levels, namely retrieval by primitive features, retrieval by logical features, and retrieval by abstract attributes. It addresses the problem of finding images relevant to the users’ information needs from image databases, based principally on low-level visual features for which automatic extraction methods are available. Due to the inherently weak connection between the high-level semantic concepts and the low-level visual features, developing this kind of system is very challenging. A popular method to improve image retrieval performance is to shift from single-round queries to navigational queries.
This kind of operation is commonly referred to as relevance feedback and can be considered supervised learning that adjusts the subsequent retrieval process by using information gathered from the user’s feedback. We also studied an image indexing method based on Self-Organizing Maps (SOMs). The SOM is interpreted as a combination of clustering and dimensionality reduction. It has the advantage of providing a natural ordering for the clusters due to the preserved topology, so the relevance information obtained from the user can be spread to neighboring image clusters. The dimensionality reduction aspect of the algorithm reduces its computational requirements. The method also includes a novel relevance feedback technique based on spreading the user responses to local SOM neighborhoods. Experiments are expected to confirm that the efficiency of semantic image retrieval can be substantially increased by using these features in parallel with the standard low-level visual features. Precision and recall were used to evaluate performance. Precision-recall graphs were produced for the 1,000- and 10,000-image data sets, along with a robustness analysis of the 1,000-image database under different brightness levels.

Keywords---Relevance, Feedback, Precision, Recall, Accuracy, Retrieval system.

I. RELEVANCE FEEDBACK (RF)

RF is a technique from text-based information retrieval used to improve the performance of information access systems. The iterative and interactive refinement of the original formulation of a query is known as relevance feedback (RF) in information retrieval. The essence of RF is to move from one-shot or batch-mode queries to navigational queries, where one query consists of multiple rounds of interaction and the user becomes an inseparable part of the query process.
During a round of RF, the user is presented with a list of retrieved items and is expected to evaluate their relevance; this information is then fed back to the retrieval system. The expected effect is that the new query round better represents the need of the user, as the query is steered toward the relevant items and away from the non-relevant ones. Three strengths of RF are: (a) it shields the user from the inner details of the retrieval system, (b) it breaks the retrieval task down into small steps that are easier to grasp, and (c) it provides a controlled setting to emphasize some features and de-emphasize others.

www.ijemr.net ISSN (ONLINE): 2250-0758, ISSN (PRINT): 2394-6962. Copyright © 2011-15. Vandana Publications. All Rights Reserved.

A. RF IN IMAGE RETRIEVAL

RF is popular in image retrieval today because (a) more ambiguity arises in interpreting images than text, making user interaction more necessary, and (b) manual modification of the initial query formulation is much more difficult in CBIR than with textual queries. RF can be seen as a form of supervised learning that steers the subsequent query toward the relevant images by using the information gathered from the user’s feedback. Another way to view RF is to regard a system implementing it as one trying to gradually learn the optimal correspondence between the high-level concepts people use and the low-level features obtained from the images. The user does not need to explicitly specify priorities for different similarity assessments because they are formed implicitly by the system based on the user–system interaction. This is advantageous since the correspondence between concepts and features is temporal and case-specific. Every image query differs from the others due to the hidden conceptions on the relevance of images and their mutual similarity, and therefore using a static image similarity measure may not be sufficient.
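The round structure described above (present items, collect relevance judgments, feed them back into the next round) can be sketched as a short loop. This is only an illustration under stated assumptions, not the method of any particular system: the toy 2-D feature vectors, the simulated user `is_relevant`, the function name `feedback_rounds`, and the re-ranking rule (rank unseen items by distance to the mean of the items marked relevant so far) are all hypothetical choices made for the example.

```python
import math
from typing import Callable, Dict, List, Sequence, Set, Tuple

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Plain Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def feedback_rounds(
    features: Dict[str, Vector],
    query: Vector,
    is_relevant: Callable[[str], bool],  # stands in for the user (assumption)
    per_round: int = 2,
    rounds: int = 3,
) -> Tuple[List[List[str]], List[str]]:
    """Run a few feedback rounds; return the items shown per round and the positives."""
    seen: Set[str] = set()
    positives: List[str] = []
    history: List[List[str]] = []
    point = list(query)
    for _ in range(rounds):
        # Only items not shown before are candidates for this round.
        unseen = [i for i in features if i not in seen]
        if not unseen:
            break
        # Rank by distance to the current query point; ties broken by id.
        unseen.sort(key=lambda i: (euclidean(features[i], point), i))
        shown = unseen[:per_round]
        seen.update(shown)
        history.append(shown)
        # Collect the user's judgments for the items just shown.
        positives.extend(i for i in shown if is_relevant(i))
        # Feed the judgments back: steer the query toward the positives.
        if positives:
            point = mean([features[i] for i in positives])
    return history, positives


toy_features = {
    "a": (0.0, 0.0), "b": (1.0, 0.0), "c": (5.0, 5.0),
    "d": (6.0, 5.0), "e": (5.0, 6.0), "f": (0.0, 1.0),
}
wanted = {"c", "d", "e"}
history, positives = feedback_rounds(toy_features, (2.0, 2.0), lambda i: i in wanted)
print(history, positives)
```

In this toy run the initial query point lies closer to the non-relevant items, so the first round returns nothing useful; once a single relevant item has been marked, the query point moves and the remaining relevant items are ranked first.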
On the other hand, user feedback should be seen not as filtering images based on some preexisting meaning but as a process of creating meaning through the interaction.

For implementing RF in a CBIR system, three minimum requirements need to be fulfilled:
a) The system must show the user a series of images, remember which images have already been shown, and not display them again. This way, the system will not end up in a loop and all images will eventually be displayed.
b) The user must somehow be able to indicate which images are to some extent relevant to the present query and which are not. Here, these images are denoted as positive and negative seen images. Clearly, this granularity of relevance assessments is only one possibility among others. The relevance scale may also be finer, e.g. containing options like “very relevant”, “relevant”, “somewhat relevant”, and so on. Relevance feedback can also take the form of direct manipulation of the query structure, as with dynamic visualization methods.
c) The system must change its behavior depending on the relevance scores provided for the seen images. During the retrieval process, more and more images are assessed and the system has an increasing amount of data to use in retrieving the succeeding image sets.

Three characteristics of RF, which distinguish it from many other applications of machine learning, are: (a) a small number of training samples (typically $N_n < 30$), (b) asymmetry of the training data, and (c) RF is used while the user is interacting with the system and thus waiting for the completion of the algorithm. An image query may take several rounds until the results are satisfactory, so fast response time is essential.

B. METHODS FROM TEXT-BASED IR

i) Relevance Feedback with VSMs

Here, each database item is represented as a point in a K-dimensional space. Textual documents are commonly represented by the words they contain. This information is encoded into a term-by-document matrix X.
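As a minimal sketch of the representation just described, the following builds a term-by-document matrix from a toy collection, with one row per term and one column per document, so that each document is a point in K-dimensional space (K being the vocabulary size). The example documents, the whitespace tokenization, the use of raw term counts as matrix entries, and the function names are assumptions made for illustration; the text does not specify a weighting scheme.

```python
from typing import List, Tuple


def term_document_matrix(docs: List[str]) -> Tuple[List[str], List[List[int]]]:
    """Return (vocabulary, X) where X[k][j] counts term k in document j."""
    tokenized = [d.lower().split() for d in docs]
    # Sorted vocabulary gives each term a fixed row index.
    vocab = sorted({t for doc in tokenized for t in doc})
    index = {t: k for k, t in enumerate(vocab)}
    X = [[0] * len(docs) for _ in vocab]
    for j, doc in enumerate(tokenized):
        for t in doc:
            X[index[t]][j] += 1
    return vocab, X


def document_vector(X: List[List[int]], j: int) -> List[int]:
    """Column j of X: document j as a point in K-dimensional term space."""
    return [row[j] for row in X]


docs = ["red sunset beach", "red car", "beach sunset sunset"]
vocab, X = term_document_matrix(docs)
print(vocab)
for term, row in zip(vocab, X):
    print(term, row)
print(document_vector(X, 2))
```

Raw counts keep the sketch short; in practice the entries are often weighted, and very common terms are removed beforehand to reduce the number of rows.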
Similarly to the database items, the query is also represented as a point or vector. To answer the query, the documents are ranked according to their similarity to it. In text-based retrieval, the standard similarity measure is the cosine measure, $\cos(q, x) = q^T x / (\|q\| \, \|x\|)$ for a query vector $q$ and a document vector $x$. Dimensionality is reduced in the preprocessing step by removing the most common terms. In this model, the following methods for query improvement exist.

Query point movement. The basic idea is to move the query point toward the part of the vector space where the relevant documents are located. This can be expressed by the formula

$$q_{n+1} = \alpha \, q_n + \beta \, \frac{1}{N^+} \sum_{x \in R^+} x - \gamma \, \frac{1}{N^-} \sum_{x \in R^-} x$$

where $q_n$ is the query point on the $n$-th round of the query, $R^+$ and $R^-$ are the sets of relevant and non-relevant items with $N^+$ and $N^-$ members respectively, and $\alpha$, $\beta$ and $\gamma$ are weight parameters ($\alpha + \beta + \gamma = 1$) controlling the relative importance of the previous query point, the average of the relevant items, and the average of the non-relevant items, respectively. The relevant items are often concentrated in a specific area of the vector space, whereas the non-relevant items are often more heterogeneous. Therefore, we should set the weights so that $\beta > \gamma$. Setting $\gamma = 0$ is also possible, resulting in purely positive feedback. The method assumes that relevance decreases as the distance to the query point increases, an assumption that does not generally capture high-level semantics well due to the semantic gap.

Feature component re-weighting. The basic idea is to increase the importance of those components of the feature vectors that help in retrieving relevant images. Each component can be given a weight, which is used in calculating the distances between images. This is easily done by augmenting the distance measure in use with component-wise weights, where the weight of the $k$-th component of the $m$-th feature is denoted $w_{mk}$. With the Euclidean distance, for example, the weighted distance between items $x$ and $y$ becomes $d(x, y) = \sqrt{\sum_m \sum_k w_{mk} (x_{mk} - y_{mk})^2}$. Assuming that the feature components are independent, when the relevant items have similar values for $f(k)$, i.e. the $k$-th component of feature $f$, it can be assumed that $f(k)$ captures something the
relevant items have in common and which corresponds to the user’s information need. where c is a constant, so .

ii) Relevance Feedback with SOMs

a) The user is expected to mark the relevant images as positive; the unmarked images are treated as negative. As all images in the database were mapped to their best-matching SOM units at the time the SOMs were trained, it is easy to locate the positive and negative images on ea
