Active query sensing for mobile location search

While much exciting progress is being made in mobile visual search, one important question has been left unexplored in all current systems. When the first query fails to find the right target (up to 50% likelihood), how should the user form his/her search strategy in the subsequent interaction? In this paper, we propose a novel Active Query Sensing system to suggest the best way for sensing the surrounding scenes while forming the second query for location search. We accomplish the goal by developing several unique components -- an offline process for analyzing the saliency of the views associated with each geographical location based on score distribution modeling, predicting the visual search precision of individual views and locations, estimating the view of an unseen query, and suggesting the best subsequent view change. Using a scalable visual search system implemented over a NYC street view data set (0.3 million images), we show a performance gain as high as two folds, reducing the failure rate of mobile location search to only 12% after the second query. This work may open up an exciting new direction for developing interactive mobile media applications through innovative exploitation of active sensing and query formulation.

[1]  A. Torralba,et al.  Matching and Predicting Street Level Images , 2010 .

[2]  Tom Drummond,et al.  Unified Loop Closing and Recovery for Real Time Monocular SLAM , 2008, BMVC.

[3]  Antonio Torralba,et al.  Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.

[4]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[5]  David Nistér,et al.  Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[6]  Shih-Fu Chang,et al.  Image Retrieval: Current Techniques, Promising Directions, and Open Issues , 1999, J. Vis. Commun. Image Represent..

[7]  James M. Rehg,et al.  Beyond the Euclidean distance: Creating effective visual codebooks using the Histogram Intersection Kernel , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[8]  Kui-Lam Kwok,et al.  TREC 2004 Robust Track Experiments Using PIRCS , 2004, TREC.

[9]  Elad Yom-Tov,et al.  Learning to estimate query difficulty: including applications to missing content detection and distributed information retrieval , 2005, SIGIR '05.

[10]  Wei Zhang,et al.  Image Based Localization in Urban Environments , 2006, Third International Symposium on 3D Data Processing, Visualization, and Transmission (3DPVT'06).

[11]  Richard Szeliski,et al.  City-Scale Location Recognition , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  James Ze Wang,et al.  Image retrieval: Ideas, influences, and trends of the new age , 2008, CSUR.

[13]  Jan-Michael Frahm,et al.  From structure-from-motion point clouds to fast location recognition , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Michael Isard,et al.  Lost in quantization: Improving particular object retrieval in large scale image databases , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[15]  Tomás Pajdla,et al.  Avoiding Confusing Features in Place Recognition , 2010, ECCV.

[16]  Konrad Tollmar,et al.  Searching the Web with mobile images for location recognition , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[17]  Iadh Ounis,et al.  Inferring Query Performance Using Pre-retrieval Predictors , 2004, SPIRE.

[18]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[19]  Jon M. Kleinberg,et al.  Mapping the world's photos , 2009, WWW '09.

[20]  Xin Chen,et al.  City-scale landmark identification on mobile devices , 2011, CVPR 2011.

[21]  Bernd Girod,et al.  Mobile Visual Search , 2011, IEEE Signal Processing Magazine.

[22]  Edward Y. Chang,et al.  Support vector machine active learning for image retrieval , 2001, MULTIMEDIA '01.

[23]  Marc Pollefeys,et al.  Handling Urban Location Recognition as a 2D Homothetic Problem , 2010, ECCV.

[24]  Meng Wang,et al.  Visual query suggestion , 2009, ACM Multimedia.