Web-Scale Multimedia Information Networks

The abundance of multimedia data on the Web presents both challenges (how to annotate, search, and mine it) and opportunities (crawling the Web to build large structured multimedia databases that support effective inference). Because of the huge data volume, treating all semantic concepts as lying on a single flat level is not viable. In this paper, we introduce a unified structured representation called multimedia information networks (MINets), which incorporates ontology and cross-media links and covers both content and context knowledge. MINets are constructed automatically from web-scale data using state-of-the-art information extraction and knowledge base population techniques, and their ontology and cross-media structures are expanded in the process. The resulting MINet contains a wide range of links, including logical, statistical, and semantic relations among informative concept nodes, which connect proliferating ontologies with cross-media web-scale resources. The raw data collected in the construction phase often contain noisy, incomplete, or even conflicting information, which can be detrimental to information extraction and utilization. The redundant link structure can then be used to distill MINets and improve the quality of information (QoI). Moreover, advanced inference theories and systems can be built on the linked MINets, so that high-level ontological knowledge can be inferred and integrated into a logically coherent network structure that is consistent with human cognition. Finally, as information channels, the ontology and cross-media links in MINets connect knowledge resources, making information more portable across resources and thereby increasing its utilization.
