Topic Modeling for Large-Scale Multimedia Analysis and Retrieval

The explosion of multimedia data in social media has created strong demand for effective and efficient computational tools that facilitate producing, analyzing, and retrieving large-scale multimedia content. Probabilistic topic models have proven effective for organizing large volumes of text documents, but far fewer related models have been proposed for other types of unstructured data such as multimedia content, partly because of the high computational cost. With the emergence of cloud computing, topic models are expected to become increasingly applicable to multimedia data. Furthermore, the growing demand for a deep understanding of multimedia data on the web drives the development of sophisticated machine learning methods. It is therefore highly desirable to develop topic modeling approaches for multimedia applications that are consistently effective, highly efficient, and easily scalable. In this chapter, we present a review of topic models for large-scale multimedia analysis. Our goal is to show the current challenges from various perspectives and to present a …
