Interactive image data labeling using self-organizing maps in an augmented reality scenario

We present an approach for conveniently labeling image patches gathered from an unrestricted environment. The system is designed for mobile Augmented Reality (AR) gear: while the user walks around wearing the head-mounted AR gear, context-free focus-of-attention modules continuously sample the most 'interesting' image patches. After this acquisition phase, a Self-Organizing Map (SOM) is trained on the complete set of patches, using combinations of MPEG-7 features as the data representation. The SOM allows the sampled patches to be visualized and easily sorted by hand into categories. With very little effort, the user can compose a training set for a classifier, so that unknown objects can be made known to the system. We evaluate the system on COIL imagery and demonstrate that a user can reach a satisfactory categorization within a few steps, even for image data sampled while walking in an office environment. (An abbreviated version of some portions of this article appeared in Bekel, H., Heidemann, G., & Ritter, H. (2005). SOM Based Image Data Structuring in an Augmented Reality Scenario. In Proceedings of the International Joint Conference on Neural Networks, Montreal, Canada, as part of the IJCNN 2005 conference proceedings, published under the IEEE copyright.)
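The idea of training a SOM on patch features and then labeling patches node by node can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic 4-dimensional vectors stand in for MPEG-7 feature combinations, and the function names (`train_som`, `best_matching_unit`), the 3x3 grid, and the linear decay schedules are all assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the node whose prototype is closest to x."""
    return min(
        weights,
        key=lambda node: sum((wi - xi) ** 2 for wi, xi in zip(weights[node], x)),
    )


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a rectangular SOM on a list of equal-length feature vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 if sigma0 is not None else max(rows, cols) / 2.0
    # Initialise each node's prototype with a copy of a randomly chosen sample.
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])   # pull prototype towards x
            step += 1
    return weights


# Synthetic stand-in for patch features: two well-separated clusters.
rng = random.Random(1)
patches = [
    [rng.gauss(center, 0.05) for _ in range(4)]
    for center in (0.0, 1.0)
    for _ in range(20)
]
som = train_som(patches, rows=3, cols=3)

# Group patch indices by their best-matching node; a user could then assign
# one category label to all patches of a node at once.
groups = {}
for idx, vec in enumerate(patches):
    groups.setdefault(best_matching_unit(som, vec), []).append(idx)

for node in sorted(groups):
    print(node, groups[node])
```

Grouping by best-matching node is only one plausible way to turn the map into a sorting aid. The interface described in the article lets the user inspect and sort the visualized patches, which this sketch does not reproduce.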
