Exploiting Surface Features for the Prediction of Podcast Preference

Podcasts display the unevenness characteristic of domains dominated by user-generated content, so the degree of user preference they enjoy can vary radically. We report on work that uses easily extractable surface features of podcasts to achieve solid performance on two podcast preference prediction tasks: classifying podcasts as preferred or non-preferred, and ranking podcasts by level of preference. We identify features with good discriminative potential through manual data analysis, which also leads to a refinement of the indicators of an existing podcast preference framework. Our preference prediction is useful for topic-independent ranking of podcasts and can support download suggestion or collection browsing.
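
As a rough illustration of the two tasks named above, the following minimal sketch scores podcasts from a handful of surface features, then uses the score both to classify and to rank. Everything in it is an assumption made for illustration: the feature names (`has_logo`, `description_chars`, `episode_count`), the hand-set weights, and the threshold are hypothetical and are not the features or the learned model used in this work.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Podcast:
    """Hypothetical surface features read from feed metadata (illustrative only)."""
    title: str
    has_logo: bool          # feed declares a logo image
    description_chars: int  # length of the feed-level description
    episode_count: int      # number of episodes listed in the feed


def preference_score(p: Podcast) -> float:
    """Combine features into one score using hand-set, assumed weights.

    A real system would learn the weights from labeled data; each feature
    here is capped and scaled to [0, 1] so no single one dominates.
    """
    score = 1.0 if p.has_logo else 0.0
    score += min(p.description_chars, 500) / 500.0
    score += min(p.episode_count, 50) / 50.0
    return score


def is_preferred(p: Podcast, threshold: float = 1.5) -> bool:
    """Task 1: binary classification via an assumed score threshold."""
    return preference_score(p) >= threshold


def rank_by_preference(podcasts: List[Podcast]) -> List[Podcast]:
    """Task 2: order podcasts from highest to lowest predicted preference."""
    return sorted(podcasts, key=preference_score, reverse=True)


if __name__ == "__main__":
    feeds = [
        Podcast("sparse feed", has_logo=False, description_chars=100, episode_count=5),
        Podcast("well-kept feed", has_logo=True, description_chars=500, episode_count=50),
    ]
    for p in rank_by_preference(feeds):
        print(p.title, is_preferred(p))
```

Because the score depends only on feed-level metadata and not on what the podcast is about, the same ordering can be applied to any set of candidates, which is the sense in which such a ranking is topic-independent.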
