Paper information - Efficient On-the-fly Category Retrieval Using ConvNets and GPUs

Efficient On-the-fly Category Retrieval Using ConvNets and GPUs

We investigate the gains in precision and speed that can be obtained by using Convolutional Networks (ConvNets) for on-the-fly retrieval, where classifiers are learnt at run time for a textual query from downloaded images and then used to rank large image or video datasets.
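The on-the-fly pipeline described above (learn a classifier at query time from downloaded images, then score and rank a dataset) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: random vectors stand in for ConvNet features, a ridge-regularised least-squares linear classifier stands in for whatever classifier the paper trains, and the helper names `l2_normalize`, `train_linear_classifier`, and `rank_dataset` are hypothetical.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit L2 norm (assumed preprocessing step)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, 1e-12)


def train_linear_classifier(pos, neg, lam=1.0):
    """Fit a linear classifier by ridge-regularised least squares.

    pos: features of images downloaded for the query (label +1).
    neg: features from a fixed pool of negative images (label -1).
    Returns the weight vector w and the bias b.
    """
    x = np.vstack([pos, neg])
    x = np.hstack([x, np.ones((x.shape[0], 1))])  # append a bias column
    y = np.concatenate([np.ones(len(pos)), -np.ones(len(neg))])
    reg = lam * np.eye(x.shape[1])
    reg[-1, -1] = 0.0  # leave the bias term unregularised
    wb = np.linalg.solve(x.T @ x + reg, x.T @ y)
    return wb[:-1], wb[-1]


def rank_dataset(features, w, b=0.0):
    """Score every dataset item and return indices sorted best-first."""
    scores = features @ w + b
    order = np.argsort(-scores)
    return order, scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 16
    mu = np.full(dim, 3.0)  # synthetic "category" direction (assumption)

    # Stand-ins for features of downloaded query images and a negative pool.
    pos = l2_normalize(mu + rng.normal(size=(30, dim)))
    neg = l2_normalize(rng.normal(size=(100, dim)))

    # Stand-in dataset: items 0..19 belong to the category, the rest do not.
    dataset = l2_normalize(
        np.vstack([mu + rng.normal(size=(20, dim)),
                   rng.normal(size=(180, dim))])
    )

    w, b = train_linear_classifier(pos, neg)
    order, scores = rank_dataset(dataset, w, b)
    print("top-10 dataset indices:", order[:10])
```

Because ranking the dataset reduces to one matrix-vector product over precomputed features, the cost at query time is dominated by feature extraction for the downloaded images and by classifier training, which is where a speed-up would matter most in a setup like this.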

[1] Bart Thomee,et al. New trends and ideas in visual concept detection: the MIR flickr retrieval evaluation initiative , 2010, MIR '10.

[2] Thomas Mensink,et al. Improving the Fisher Kernel for Large-Scale Image Classification , 2010, ECCV.

[3] Svetlana Lazebnik,et al. Locality-sensitive binary codes from shift-invariant kernels , 2009, NIPS.

[4] Michael Isard,et al. Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[5] Andrew Zisserman,et al. Learning Local Feature Descriptors Using Convex Optimisation , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6] Cordelia Schmid,et al. Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[7] Cordelia Schmid,et al. Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search , 2008, ECCV.

[8] Ivor W. Tsang,et al. Using large-scale web data to facilitate textual query based retrieval of consumer photos , 2009, MM '09.

[9] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[10] Stefan Carlsson,et al. CNN Features Off-the-Shelf: An Astounding Baseline for Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[11] Andrew Zisserman,et al. The devil is in the details: an evaluation of recent feature encoding methods , 2011, BMVC.

[12] Andrew Zisserman,et al. VISOR: Towards On-the-Fly Large-Scale Object Category Retrieval , 2012, ACCV.

[13] Tinne Tuytelaars,et al. Mining Multiple Queries for Image Retrieval: On-the-Fly Learning of an Object-Specific Mid-level Representation , 2013, 2013 IEEE International Conference on Computer Vision.

[14] Hervé Jégou,et al. Anti-sparse coding for approximate nearest neighbor search , 2011, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[15] Cordelia Schmid,et al. Product Quantization for Nearest Neighbor Search , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16] Simon Haykin,et al. Gradient-Based Learning Applied to Document Recognition , 2001 .

[17] Cordelia Schmid,et al. Good Practice in Large-Scale Learning for Image Classification , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[19] Luc Van Gool,et al. The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[20] Mubarak Shah,et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild , 2012, ArXiv.

[21] Trevor Darrell,et al. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.

[22] Florent Perronnin,et al. Modeling the spatial layout of images beyond spatial pyramids , 2012, Pattern Recognit. Lett.

[23] Yoram Singer,et al. Pegasos: primal estimated sub-gradient solver for SVM , 2011, Math. Program.

[24] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[25] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[26] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[27] Andrew Zisserman,et al. A Compact and Discriminative Face Track Descriptor , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[28] Florent Perronnin,et al. High-dimensional signature compression for large-scale image classification , 2011, CVPR 2011.

[29] Andrew Zisserman,et al. Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[30] David Nistér,et al. Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[31] Andrew Zisserman,et al. On-the-fly specific person retrieval , 2012, 2012 13th International Workshop on Image Analysis for Multimedia Interactive Services.

[32] Andrew Zisserman,et al. All About VLAD , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[33] Andrew Zisserman,et al. Return of the Devil in the Details: Delving Deep into Convolutional Nets , 2014, BMVC.

[34] Mark J. Huiskes,et al. The MIR flickr retrieval evaluation , 2008, MIR '08.

[35] Antonio Torralba,et al. 80 Million Tiny Images: A Large Data Set for Nonparametric Object and Scene Recognition , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36] Andrew Zisserman,et al. Three things everyone should know to improve object retrieval , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.