Predicting podcast preference: An analysis framework and its application

Finding worthwhile podcasts can be difficult for listeners because podcasts are published in large numbers and vary widely in quality and repute. Independently of their informational content, certain podcasts provide satisfying listening material while others have little or no appeal. In this paper we present PodCred, a framework for analyzing listener appeal, and we demonstrate its application to the task of automatically predicting the listening preferences of users. First, we describe the PodCred framework, which consists of an inventory of factors contributing to user perceptions of the credibility and quality of podcasts. The framework is designed to support automatic prediction of whether or not a particular podcast will enjoy listener preference. It consists of four categories of indicators, related to the Podcast Content, the Podcaster, the Podcast Context, and the Technical Execution of the podcast. Three studies contributed to the development of the PodCred framework: a review of the literature on credibility for other media, a survey of prescriptive guidelines for podcasting, and a detailed data analysis. Next, we report on a validation exercise in which the PodCred framework is applied to a real-world podcast preference prediction task. Our validation focuses on selected framework indicators that show promise of being both discriminative and readily accessible. We translate these indicators into a set of easily extractable “surface” features and use them to implement a basic classification system. The experiments carried out to evaluate the system use popularity levels in iTunes as ground truth and demonstrate that simple surface features derived from the PodCred framework are indeed useful for classifying podcasts.
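
The step from framework indicators to "surface" features and a basic classifier can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `Feed` record, the four features (logo present, description length, episode count, release regularity), the toy data, and the nearest-centroid rule, which stands in for whatever classifier the authors actually used.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Feed:
    """Hypothetical feed record; the field names are illustrative only."""
    has_logo: bool
    description: str
    episode_days: list  # release days as ascending integer offsets


def surface_features(feed):
    """Map a feed to a small vector of cheaply extractable features."""
    gaps = [b - a for a, b in zip(feed.episode_days, feed.episode_days[1:])]
    return [
        1.0 if feed.has_logo else 0.0,          # feed-level metadata present
        float(len(feed.description.split())),   # description length in words
        float(len(feed.episode_days)),          # number of episodes
        pstdev(gaps) if gaps else 0.0,          # irregularity of release gaps
    ]


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [mean(col) for col in zip(*rows)]


def classify(x, centroids):
    """Return the label whose centroid is closest in squared distance."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(centroids, key=lambda label: dist(x, centroids[label]))


# Toy training data (invented); labels stand in for popularity levels.
popular = [
    Feed(True, "word " * 40, [0, 7, 14, 21, 28]),
    Feed(True, "word " * 30, [0, 7, 14, 22]),
]
unpopular = [
    Feed(False, "short text", [0, 40]),
    Feed(False, "a b c", [0, 3, 90]),
]
centroids = {
    "popular": centroid([surface_features(f) for f in popular]),
    "unpopular": centroid([surface_features(f) for f in unpopular]),
}

new_feed = Feed(True, "word " * 25, [0, 7, 14])
prediction = classify(surface_features(new_feed), centroids)
```

The features are left unscaled to keep the sketch short; with real data they would normally be normalized so that no single feature dominates the distance.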
