Robust Building Identification for Mobile Augmented Reality

Mobile augmented reality applications have received considerable interest in recent years as camera-equipped mobile phones have become ubiquitous. We have developed a “Point and Find” application for cell phones, in which a user points the phone at a building on the Stanford campus and receives relevant information about that building on the phone. Recognizing buildings under different lighting conditions, and in the presence of occlusion and clutter, remains a challenging problem. Nistér’s Scalable Vocabulary Tree (SVT) [1] approach has received considerable interest for large-scale object recognition. The scheme uses hierarchical k-means to create a vocabulary of features, or “visual words”. We first show how an SVT combined with an entropy-based ranking metric achieves 100% recognition on the well-known ZuBuD data set [2]. We then present an SVM kernel-based extension to the SVT approach and show that it also achieves a 100% recognition rate. Finally, we discuss the shortcomings of the ZuBuD data set and present a more challenging Stanford-Nokia data set, with promising results.
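
As an illustration of the hierarchical k-means idea mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes toy 2-D points in place of real image descriptors, a plain Lloyd-style k-means, and a nested-dictionary tree; the names `build_tree` and `quantize`, the branching factor `k`, and the `depth` parameter are all hypothetical choices made for this example. The entropy-based ranking and the SVM extension are not shown.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def assign(points, centers):
    """Group each point with its nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, k, iters=10, rng=None):
    """Plain Lloyd iterations; an empty cluster keeps its previous center."""
    rng = rng or random.Random(0)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = assign(points, centers)
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, assign(points, centers)


def build_tree(points, k, depth, rng=None):
    """Recursively split the points into k clusters, up to `depth` levels."""
    rng = rng or random.Random(0)
    node = {"children": []}
    if depth == 0 or len(points) < k:
        return node  # leaf: one "visual word"
    centers, clusters = kmeans(points, k, rng=rng)
    for center, cluster in zip(centers, clusters):
        child = build_tree(cluster, k, depth - 1, rng)
        child["center"] = center
        node["children"].append(child)
    return node


def quantize(tree, descriptor):
    """Descend to a leaf; the path of branch indices identifies the visual word."""
    path, node = [], tree
    while node["children"]:
        kids = node["children"]
        i = min(range(len(kids)), key=lambda j: sq_dist(descriptor, kids[j]["center"]))
        path.append(i)
        node = kids[i]
    return tuple(path)


# Toy data (assumption): two well-separated groups standing in for descriptors.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]
tree = build_tree(points, k=2, depth=1)
word_a = quantize(tree, (0.05, 0.05))
word_b = quantize(tree, (10.05, 10.05))
```

In this sketch a query descriptor is compared against only `k` centers per level rather than against every leaf, which is the usual motivation for organizing a visual vocabulary as a tree.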

[1] Luc Van Gool, et al. Fast indexing for image retrieval based on local appearance with re-ranking, 2003, Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429).

[2] Marcel Worring, et al. Content-Based Image Retrieval at the End of the Early Years, 2000, IEEE Trans. Pattern Anal. Mach. Intell.

[3] James Ze Wang, et al. SIMPLIcity: Semantics-Sensitive Integrated Matching for Picture LIbraries, 2000, IEEE Trans. Pattern Anal. Mach. Intell.

[4] Cordelia Schmid, et al. Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study, 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[5] Lucas Paletta, et al. A Mobile Vision System for Urban Detection with Informative Local Descriptors, 2006, Fourth IEEE International Conference on Computer Vision Systems (ICVS'06).

[6] Michael Isard, et al. Object retrieval with large vocabularies and fast spatial matching, 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[7] Tsuhan Chen, et al. DISCOV: A Framework for Discovering Objects in Video, 2008, IEEE Transactions on Multimedia.

[8] Joo-Hwee Lim, et al. Scene Identification using Discriminative Patterns, 2006, 18th International Conference on Pattern Recognition (ICPR'06).

[9] Konrad Tollmar, et al. Searching the Web with mobile images for location recognition, 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004).

[10] David Nistér, et al. Scalable Recognition with a Vocabulary Tree, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[11] Christopher Hunt, et al. Notes on the OpenSURF Library, 2009.

[12] Richard Szeliski, et al. City-Scale Location Recognition, 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[13] Stepán Obdrzálek, et al. Object recognition methods based on transformation covariant features, 2004, 2004 12th European Signal Processing Conference.