A framework for keyword-based image retrieval from a Twitter dataset using the proposed HOG_SIFT feature extraction method

Abstract Social media generates a huge amount of user-generated content about real-world events every minute. Twitter has gained tremendous popularity over the past few years, with millions of tweets posted each day. Twitter data can be monitored through the Twitter Streaming API in order to reveal, model, analyze, and benefit from user behavior, which is a major advantage of this micro-blogging network and makes it well suited to data mining. In this paper, a self-built Twitter dataset is prepared based on events reflected in social media, and images related to these events are stored in an image database. The first goal of this paper is to discover keywords from the extracted tweets using pre-processing steps. The second goal is to extract image features related to the detected keywords using the proposed HOG_SIFT feature extraction method. The third goal is to retrieve the significant images from the image database using subspace clustering techniques, namely the k-subspace and seq-k-subspace algorithms. Experiments show that the proposed HOG_SIFT feature extraction method is efficient and performs better than the SIFT method. The clustering algorithms are likewise compared using precision, recall, and accuracy, and the results show that seq-k-subspace performs better than the k-subspace clustering algorithm.
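As an illustration of the first goal only, the following minimal sketch shows one way keywords might be discovered from tweets through pre-processing. The abstract does not specify the pre-processing steps, so the steps chosen here (lowercasing, removing URLs and mentions, stop-word filtering, frequency counting), the stop-word list, the function names `preprocess` and `top_keywords`, and the sample tweets are all assumptions made for illustration, not the method used in the paper.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only (not from the paper).
STOP_WORDS = {"the", "a", "an", "is", "in", "at", "of", "to", "and",
              "for", "on", "rt", "this", "that", "with"}

def preprocess(tweet):
    """Return the cleaned tokens of one tweet (assumed pre-processing steps)."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = text.replace("#", " ")              # keep the hashtag word itself
    tokens = re.findall(r"[a-z]+", text)       # alphabetic tokens only
    # Discard stop words and very short tokens.
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 2]

def top_keywords(tweets, k=3):
    """Return the k most frequent tokens across all tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(preprocess(tweet))
    return [word for word, _ in counts.most_common(k)]

# Made-up sample tweets, not taken from the paper's dataset.
tweets = [
    "RT @news: Flood in the city centre http://t.co/abc #flood",
    "Heavy flood at the river bank, city on alert",
    "Flood relief teams reach the city",
]
print(top_keywords(tweets, 2))
```

In a pipeline of the kind the abstract describes, the discovered keywords would then be used to select the event-related images whose features are extracted and clustered.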