Telefonica Research System for the Spoken Web Search task at Mediaeval 2012

In this paper we describe the systems presented by Telefonica Research to the Spoken Web Search task of the Mediaeval 2012 evaluation. This year we proposed two systems. The first one consists of a segmental DTW system, similar to the one presented in 2011, with a few improvements. The second system also uses a DTW-like approach, but allows all reference files to be searched at once using an information retrieval approach.
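As an illustration of the kind of DTW-based matching the abstract refers to, the following is a minimal sketch of subsequence dynamic time warping, in which a short query is aligned against any stretch of a longer reference. It is not the authors' implementation: the function names (`euclidean`, `subsequence_dtw`), the Euclidean frame distance, the three-way step pattern, and the toy one-dimensional features are all assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature frames (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def subsequence_dtw(query, reference):
    """Return (cost, end_index) of the best match of query inside reference.

    The match may start and end at any reference frame. Only two rows of
    the accumulated-cost matrix are kept in memory at a time.
    """
    m, n = len(query), len(reference)
    # First row: starting the match at any reference frame is free,
    # so no cost is accumulated along the reference axis.
    prev = [euclidean(query[0], reference[j]) for j in range(n)]
    for i in range(1, m):
        curr = [0.0] * n
        # First column: only a vertical step is possible.
        curr[0] = prev[0] + euclidean(query[i], reference[0])
        for j in range(1, n):
            # Cheapest of the vertical, horizontal, and diagonal predecessors.
            best = min(prev[j], curr[j - 1], prev[j - 1])
            curr[j] = euclidean(query[i], reference[j]) + best
        prev = curr
    # Free end point: choose the reference frame with the lowest final cost.
    end = min(range(n), key=lambda j: prev[j])
    return prev[end], end


# Toy one-dimensional "features" (hypothetical data for illustration only).
query = [[0.0], [1.0], [2.0]]
reference = [[5.0], [5.0], [0.0], [1.0], [2.0], [5.0]]
cost, end = subsequence_dtw(query, reference)
```

In this toy case the query occurs exactly at reference frames 2 to 4, so the lowest-cost alignment ends at index 4. A practical system would typically also normalize the cost by path length and recover the start frame by backtracking, neither of which is shown here.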
