Podcast search: user goals and retrieval technologies

Purpose - This research aims to identify users' goals and strategies when searching for podcasts and their impact on the design of podcast retrieval technology. In particular, the paper explores the potential to address user goals with indexing based on podcast metadata and automatic speech recognition (ASR) transcripts.

Design/methodology/approach - A user study was conducted to obtain an overview of podcast search behaviour and goals, using a multi-method approach comprising an online survey, a diary study, and contextual interviews. A subsequent podcast retrieval experiment investigated the retrieval performance of the two choices of indexing features for the search goals identified during the study.

Findings - Study participants used a variety of search strategies, partially influenced by the available tools and their perceptions of these tools. Furthermore, the experimental results revealed that retrieval using ASR transcripts performed significantly better than metadata-based searching. However, a detailed result analysis suggested that the efficacy of the indexing methods was search-goal dependent.

Research limitations/implications - The research constitutes a step towards a future framework for investigating user needs and addressing them in an experimental set-up. It was primarily qualitative and exploratory in nature.

Practical implications - Podcast search engines require evidence about suitable indexing methods in order to make an informed decision about whether generating speech recognition transcripts is worth the resources.

Originality/value - Systematic studies of podcast searching have not previously been reported. Investigations of this kind hold the potential to optimise podcast retrieval in the long term.
