Robust latent Poisson deconvolution from multiple imperfect features for web topic detection

In web topic detection, detecting “hot” topics from the enormous volume of User-Generated Content (UGC) on the web poses two main difficulties that conventional approaches can barely handle: 1) poor feature representations from noisy images and short texts; and 2) uncertain roles of modalities, where visual content may be either highly or weakly relevant to textual cues because the data are loosely constrained. In this paper, following the detection-by-ranking approach, we address the problem by learning a robust shared representation from multiple noisy and complementary features, and by integrating both textual and visual graphs into a k-Nearest Neighbor Similarity Graph (k-N2SG). Non-negative Matrix Factorization using Random walk (NMFR) is then introduced to generate topic candidates. The multiple graphs are fused efficiently by a Latent Poisson Deconvolution (LPD), which consists of a Poisson deconvolution with sparse basis similarities for each edge. Experiments on two public data sets show significantly improved accuracy of the proposed approach in comparison with state-of-the-art methods.
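To make the graph-construction step concrete, the following is a minimal sketch of building one k-nearest-neighbor similarity graph per modality and combining them. It is an illustration under stated assumptions, not the paper's method: the cosine similarity, the function names `knn_similarity_graph` and `fuse_graphs`, and the weighted-average fusion are all hypothetical stand-ins. In particular, the averaging step is only a placeholder for the LPD fusion, whose details the abstract does not give.

```python
import numpy as np


def knn_similarity_graph(features, k):
    """Build a symmetric k-NN graph weighted by cosine similarity.

    Assumption: cosine similarity is used purely for illustration.
    Requires k < number of rows in `features`.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero rows
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # a node is never its own neighbor

    n = sim.shape[0]
    neighbors = np.argsort(-sim, axis=1)[:, :k]  # k most similar per row
    rows = np.repeat(np.arange(n), k)
    cols = neighbors.ravel()

    graph = np.zeros((n, n))
    graph[rows, cols] = np.clip(sim[rows, cols], 0.0, None)
    # Keep an edge if either endpoint selected the other.
    return np.maximum(graph, graph.T)


def fuse_graphs(graphs, weights=None):
    """Placeholder fusion: weighted average of adjacency matrices.

    This is NOT Latent Poisson Deconvolution; it only marks where a
    fusion step would sit in the pipeline.
    """
    stacked = np.stack(graphs)
    if weights is None:
        weights = np.full(len(graphs), 1.0 / len(graphs))
    weights = np.asarray(weights, dtype=float)
    return np.tensordot(weights, stacked, axes=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    text_features = rng.random((8, 16))    # hypothetical textual features
    visual_features = rng.random((8, 32))  # hypothetical visual features
    text_graph = knn_similarity_graph(text_features, k=3)
    visual_graph = knn_similarity_graph(visual_features, k=3)
    fused = fuse_graphs([text_graph, visual_graph])
    print(fused.shape)
```

The fused adjacency matrix would then be passed to a clustering step (NMFR in the paper) to produce topic candidates. A plain average treats every modality as equally reliable on every edge, which is exactly the assumption the abstract argues against.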
