Spatially Regularized Latent Topic Model for Simultaneous Object Discovery and Segmentation

Latent Dirichlet Allocation (LDA) has been increasingly applied in computer vision. However, LDA rests on the 'bag of words' assumption, which ignores the spatial structure of images, and this omission has a non-trivial impact on the model's performance. A number of methods attempt to address this limitation. One representative work is the Spatial Latent Topic Model (Spatial-LTM) for unsupervised joint object discovery and segmentation, which improves on LDA by assigning locally co-occurring visual words to the same topic. However, this model still ignores the spatial relations between visual words that are distant from each other. In this paper, we add a spatial regularization term to the model's posterior distribution that regulates the difference in multinomial weight between each pair of visual words in a topic according to their spatial distance in an image set. We call the improved model the Spatially Regularized Latent Topic Model (SR-LTM). Experimental results show that SR-LTM outperforms Spatial-LTM in both unsupervised object discovery accuracy and segmentation accuracy.
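
The abstract does not state the exact form of the regularization term, so the following is only an illustrative sketch under stated assumptions. It assumes a graph-Laplacian-style pairwise penalty in which visual words that lie close together receive a larger weight, so that differences in their per-topic multinomial weights cost more. The function name spatial_penalty, the exponential weighting exp(-d / sigma), and the parameter sigma are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np


def spatial_penalty(phi, dist, sigma=1.0):
    """Assumed pairwise spatial penalty on topic-word weights.

    phi  : (K, V) array, per-topic multinomial weights over V visual words.
    dist : (V, V) symmetric array, spatial distance between each pair of
           visual words, aggregated over the image set (assumed given).
    sigma: hypothetical bandwidth controlling how fast the weight decays.

    Returns sum over topics k and unordered pairs (i, j) of
    w_ij * (phi[k, i] - phi[k, j]) ** 2, with w_ij = exp(-dist_ij / sigma).
    """
    phi = np.asarray(phi, dtype=float)
    dist = np.asarray(dist, dtype=float)

    # Closer word pairs get weights nearer to 1, distant pairs nearer to 0.
    w = np.exp(-dist / sigma)
    np.fill_diagonal(w, 0.0)  # a word is never penalized against itself

    # diff[k, i, j] = phi[k, i] - phi[k, j], built by broadcasting.
    diff = phi[:, :, None] - phi[:, None, :]

    # Each unordered pair appears twice in the full sum, hence the 0.5.
    return 0.5 * np.sum(w[None, :, :] * diff ** 2)
```

Under these assumptions, a regularized objective could subtract a multiple of this penalty from the log posterior, with the multiplier trading off data fit against spatial smoothness of the topic-word weights.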
