This work targets the image retrieval task held by the MSR-Bing Grand Challenge. Image retrieval is considered a challenging task because of the gap between low-level image representations and high-level textual query representations. Recent advances in deep neural networks shed light on narrowing this gap by learning high-level image representations from raw pixels. In this paper, we propose a bag-of-words based deep neural network (DNN) for the image retrieval task, which learns a high-level image representation and maps images into a bag-of-words space. The DNN model is trained on large-scale clickthrough data. The relevance between a query and an image is measured by the cosine similarity between the query's bag-of-words representation and the image's bag-of-words representation predicted by the DNN; the visual similarity between images is computed from the high-level image representations extracted by the same DNN model. Finally, the PageRank algorithm is used to further improve the ranking list by considering the visual similarity of images for each query. Experimental results achieve state-of-the-art performance and verify the effectiveness of the proposed method.
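The abstract does not give implementation details, so the following is a minimal sketch of the two scoring steps it describes: cosine-similarity relevance between bag-of-words vectors, and a PageRank-style re-ranking over a visual-similarity graph. It assumes NumPy arrays for all representations and a personalized-PageRank formulation in which the teleport distribution is the initial relevance; the exact PageRank variant used in the paper is not specified, and all names and parameters here are hypothetical.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def rerank_with_pagerank(relevance: np.ndarray,
                         visual_sim: np.ndarray,
                         damping: float = 0.85,
                         n_iter: int = 50) -> np.ndarray:
    """Re-rank the candidate images of one query.

    relevance  : (n,) initial query-image scores, e.g. cosine similarity
                 between the query's bag-of-words vector and each image's
                 DNN-predicted bag-of-words vector.
    visual_sim : (n, n) pairwise visual similarity between the candidates,
                 computed from the DNN's high-level image features.
    Returns final scores; higher means more relevant.
    """
    n = len(relevance)
    # Row-normalize the visual-similarity matrix into a transition matrix;
    # rows that sum to zero (isolated images) become all-zero rows.
    row_sums = visual_sim.sum(axis=1, keepdims=True)
    transition = np.divide(visual_sim, row_sums,
                           out=np.zeros_like(visual_sim),
                           where=row_sums > 0)
    # Personalized PageRank: teleporting follows the initial relevance,
    # so the random walk stays biased toward textually relevant images.
    total = relevance.sum()
    prior = relevance / total if total > 0 else np.full(n, 1.0 / n)
    scores = prior.copy()
    for _ in range(n_iter):
        scores = (1 - damping) * prior + damping * (transition.T @ scores)
    return scores
```

In this formulation, visually similar images reinforce each other's scores, so an image that is both textually relevant and visually consistent with other relevant candidates rises in the final ranking.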