UB at the NTCIR-12 SpokenQuery&Doc-2: Spoken Content Retrieval Using Multiple ASR Hypotheses and Syllables

The University at Buffalo (UB) team participated in the SpokenQuery&Doc task at NTCIR-12, working on the Spoken Content Retrieval (SCR) subtask. We investigated whether multiple ASR hypotheses (words) and subword units (syllables) improve retrieval effectiveness. We also compared the retrieval effectiveness of texts generated by two automatic speech recognition (ASR) engines, Julius and KALDI. Our experimental results showed that using multiple ASR hypotheses did not improve retrieval effectiveness, and that using ASR syllables alone led to lower mean average precision than using ASR words. Furthermore, ASR texts generated by the KALDI system resulted in significantly better retrieval effectiveness than those generated by the Julius system. Future areas of work are discussed.
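
Since the comparisons above are reported in terms of mean average precision (MAP), a minimal sketch of how that metric is usually computed may help. It follows the standard textbook definition rather than the evaluation tooling used for the task, and the function names, document IDs, and toy data are all hypothetical.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against a set of relevant IDs."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at the rank where a relevant document is found.
            precision_sum += hits / rank
    # Relevant documents that are never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of per-query average precision over all judged queries."""
    if not qrels:
        return 0.0
    total = sum(
        average_precision(runs.get(query_id, []), relevant)
        for query_id, relevant in qrels.items()
    )
    return total / len(qrels)


# Toy data (hypothetical): two queries with ranked results and judgments.
example_runs = {"q1": ["d1", "d2", "d3"], "q2": ["d5", "d4"]}
example_qrels = {"q1": {"d1", "d3"}, "q2": {"d4"}}
example_map = mean_average_precision(example_runs, example_qrels)
```

For the toy data, query `q1` has an average precision of (1/1 + 2/3) / 2 and query `q2` has 1/2, so the mean over both queries is 2/3.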
