A Mobile Vision System for Urban Detection with Informative Local Descriptors

We present a computer vision system for detecting and identifying urban objects in mobile phone imagery, for example in tourist information services. Recognition is based on maximum a posteriori (MAP) decision making over weak object hypotheses derived from local descriptor responses in the mobile imagery. We improve on the standard SIFT key detector [7] by selecting only informative (i-SIFT) keys for descriptor matching. This selection serves two purposes: it reduces the complexity of the object model, and it accelerates detection through selective filtering. We report results on the MPG-20 mobile phone imagery, which contains severe illumination, scale, and viewpoint changes. The system achieves ≈ 98% identification accuracy, 100% background rejection, a 0% false alarm rate, and reliable quality of service under extreme illumination conditions. It significantly improves on standard SIFT-based recognition in every respect and, importantly for mobile vision, runs ≈ 8 (≈ 24) times faster on the MPG-20 (ZuBuD) database.
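The decision scheme described above can be illustrated with a minimal sketch. It is not the system's implementation. It assumes that each local descriptor yields a posterior distribution over candidate objects (a weak hypothesis), that "informative" means low posterior entropy, and that weak hypotheses are combined by simple averaging before the MAP choice. The function names, the entropy threshold, and the averaging rule are all hypothetical choices made for illustration.

```python
import math


def entropy(posterior):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in posterior if p > 0)


def map_decision(local_posteriors, max_entropy=1.0):
    """Pick the most probable object from weak local hypotheses.

    local_posteriors: list of per-descriptor distributions over objects.
    max_entropy: assumed threshold; descriptors whose posterior entropy
        exceeds it are treated as uninformative and filtered out.
    Returns (best_object_index, averaged_scores), or (None, []) when no
    descriptor passes the filter (read here as background rejection).
    """
    # Selective filtering: keep only descriptors with a peaked posterior.
    informative = [p for p in local_posteriors if entropy(p) <= max_entropy]
    if not informative:
        return None, []

    # Combine weak hypotheses by averaging (an assumption of this sketch).
    n_objects = len(informative[0])
    scores = [
        sum(p[k] for p in informative) / len(informative)
        for k in range(n_objects)
    ]

    # MAP decision: the object with the highest combined score.
    best = max(range(n_objects), key=scores.__getitem__)
    return best, scores


# Four descriptor responses over three candidate objects.
responses = [
    [0.90, 0.05, 0.05],
    [0.80, 0.10, 0.10],
    [0.34, 0.33, 0.33],  # near-uniform, so it is filtered out
    [0.10, 0.80, 0.10],
]
best_object, combined = map_decision(responses)
```

Filtering before combination is what makes the selection pay off twice in this sketch: fewer descriptors need to be stored in the model, and fewer need to be matched at detection time.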

[1]  David G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints, 2004.

[2]  Roberto Cipolla, et al. Modelling and Interpretation of Architecture from Several Images, 2004, International Journal of Computer Vision.

[3]  Luc Van Gool, et al. HPAT Indexing for Fast Object/Scene Recognition Based on Local Appearance, 2003, CIVR.

[4]  Horst Bischof, et al. Object recognition using local information content, 2004, Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004).

[5]  Stepán Obdrzálek, et al. Object Recognition using Local Affine Frames on Distinguished Regions, 2002, BMVC.

[6]  J. Ross Quinlan. C4.5: Programs for Machine Learning, 1992.

[7]  Aiko M. Hormann, et al. Programs for Machine Learning. Part I, 1962, Inf. Control.

[8]  B. Hofmann-Wellenhof, et al. Global Positioning System, 1992.

[9]  Yi Li, et al. Consistent line clusters for building recognition in CBIR, 2002, Object recognition supported by user interaction for service robots.

[10]  Cordelia Schmid, et al. An Affine Invariant Interest Point Detector, 2002, ECCV.

[11]  Konrad Tollmar, et al. Searching the Web with mobile images for location recognition, 2004, CVPR 2004.

[12]  Pietro Perona, et al. Object class recognition by unsupervised scale-invariant learning, 2003, IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[13]  Cordelia Schmid, et al. A performance evaluation of local descriptors, 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Cordelia Schmid, et al. Selection of scale-invariant parts for object class recognition, 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[15]  Shimon Ullman, et al. Object recognition with informative features and linear classification, 2003, Proceedings Ninth IEEE International Conference on Computer Vision.