Mining the Web for Multimedia-Based Enriching

As the volume of social media shared on the Internet grows, it becomes possible to explore a topic from a novel, people-based viewpoint. We aim to enrich topics with media items mined from social media sharing platforms. However, data collected from the Web is likely to contain noise, so the collected documents must be processed further to ensure relevance. To this end, we designed an approach that automatically proposes a cleaned set of media items related to events mined from search trends. Each event is described by word tags and linked to a pool of videos from which relevant content is proposed. Non-relevant items are removed from this pool beforehand using information retrieval techniques. We report the results of our approach by automatically illustrating the popular moments of four celebrities.
