Image Representation with Bag of bi-SIFT

Local features are widely adopted to describe visual information in tasks such as image registration, and the most used and studied feature today is SIFT (Scale Invariant Feature Transform), owing to its strong local descriptive power and its reliability under different acquisition conditions. We propose a feature that is built on SIFT features and tends to capture larger image areas, which makes it usable for semantic-based tasks. We call these features bi-SIFT for their resemblance to textual bigrams. We tested the capability of the proposed representation on the Corel dataset. In particular, we computed the most representative features through a clustering process and used these values according to the "visual terms" paradigm. Experiments on the representation of sets of images with the proposed features are shown. Although preliminary, the results appear encouraging.
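As a rough illustration of the "visual terms" paradigm mentioned above, the following minimal Python sketch quantizes paired descriptors against a set of cluster centroids and counts term occurrences per image. It is not the authors' implementation: the pairing rule (joining keypoints closer than `max_dist`), the function names, and the assumption that centroids are already available from some clustering step are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

def pair_descriptors(keypoints, max_dist):
    """Build paired descriptors from keypoints given as ((x, y), descriptor).

    Hypothetical pairing rule: two keypoints form a pair when their image
    positions are at most max_dist apart; the paired descriptor is the
    concatenation of the two single descriptors.
    """
    pairs = []
    for i in range(len(keypoints)):
        (xi, yi), di = keypoints[i]
        for j in range(i + 1, len(keypoints)):
            (xj, yj), dj = keypoints[j]
            if math.hypot(xi - xj, yi - yj) <= max_dist:
                pairs.append(tuple(di) + tuple(dj))
    return pairs

def nearest_term(descriptor, centroids):
    """Return the index of the centroid closest to the descriptor (Euclidean)."""
    return min(range(len(centroids)),
               key=lambda k: math.dist(descriptor, centroids[k]))

def bag_of_terms(descriptors, centroids):
    """Count how many descriptors fall on each visual term (centroid index)."""
    counts = Counter(nearest_term(d, centroids) for d in descriptors)
    return [counts.get(k, 0) for k in range(len(centroids))]
```

In this sketch an image is represented by the histogram returned by `bag_of_terms`, in analogy with term counts in a text document. Real SIFT descriptors are 128-dimensional, so a paired descriptor built this way would have 256 components; the centroids would come from clustering paired descriptors collected over a training set.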
