Mediapedia: Mining Web Knowledge to Construct Multimedia Encyclopedia

In recent years, we have witnessed the blooming of Web 2.0 content such as Wikipedia, Flickr and YouTube. How might we benefit from such rich media resources available on the Internet? This paper presents a novel concept called Mediapedia, a dynamic multimedia encyclopedia that takes advantage of, and is in fact built from, the text and image resources on the Web. Mediapedia distinguishes itself from the traditional encyclopedia in four main ways. (1) It aims to present users with multimedia content (e.g., text, image, video), which we believe is more intuitive and informative to users. (2) It is fully automated: it downloads the media content and the corresponding textual descriptions from the Web and assembles them for presentation. (3) It is dynamic, as it uses the latest multimedia content to compose the answer, which is not true of the traditional encyclopedia. (4) Its design is flexible and extensible, so that new kinds of media, such as video, and new languages can easily be incorporated into the framework. The effectiveness of Mediapedia is demonstrated, and two potential applications are described in this paper.
