Paper information - Faceted search and browsing of audio content on spoken web

Spoken Web is a web of VoiceSites that can be accessed by phone. The content of a VoiceSite is audio. Spoken Web therefore provides an alternative to the World Wide Web (WWW) in developing regions, where low Internet penetration and low literacy are barriers to accessing the conventional WWW. Searching audio content in Spoken Web through an audio query-result interface presents two key challenges: the indexing of audio content is not accurate, and the presentation of results in audio is sequential and therefore cumbersome. In this paper, we apply the concepts of faceted search and browsing to the Spoken Web search problem. We use facets to index the metadata associated with the audio content, and we provide a mechanism to rank the facets based on the search results. We develop an interactive query interface that enables easy browsing of search results through the top-ranked facets. To our knowledge, this is the first system to use the concept of facets in audio search, and the first solution that provides audio search for the rural population. We present quantitative results to illustrate the accuracy and effectiveness of the faceted search, and qualitative results to highlight the usability of the interactive browsing system. The experiments were conducted on more than 4000 audio documents collected from a live Spoken Web VoiceSite, and evaluations were carried out with 40 farmers, who are the target users of the VoiceSite.
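The abstract does not say how facets are ranked, so the following is only a minimal sketch of one plausible approach, not the authors' method. It assumes each search result carries a small metadata record and scores each facet by the Shannon entropy of its values over the result set, so that facets that split the results more evenly come first. The function name `rank_facets`, the facet names, and the sample records are all hypothetical.

```python
import math
from collections import Counter


def rank_facets(results, facets):
    """Order facet names by the entropy of their values over `results`.

    `results` is a list of metadata dicts (facet name -> value).
    Entropy is an assumed stand-in heuristic, not the paper's ranking.
    """
    scores = {}
    for facet in facets:
        # Count how often each value of this facet occurs in the results.
        counts = Counter(doc[facet] for doc in results if facet in doc)
        total = sum(counts.values())
        if total == 0:
            scores[facet] = 0.0
            continue
        # Shannon entropy in bits: higher means the facet splits results more evenly.
        scores[facet] = -sum(
            (c / total) * math.log2(c / total) for c in counts.values()
        )
    return sorted(facets, key=lambda f: scores[f], reverse=True)


# Hypothetical metadata for four audio documents returned by a query.
results = [
    {"crop": "cotton", "district": "A", "type": "question"},
    {"crop": "rice", "district": "A", "type": "question"},
    {"crop": "wheat", "district": "A", "type": "answer"},
    {"crop": "maize", "district": "A", "type": "answer"},
]
ranked = rank_facets(results, ["district", "type", "crop"])
print(ranked)  # ['crop', 'type', 'district']
```

In an audio interface, where options are read out one after another, a real system would probably also cap the number of values offered per facet. That refinement is left out of this sketch.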
