Using Document Summarization Techniques for Speech Data Subset Selection

In this paper we leverage methods from submodular function optimization developed for document summarization and apply them to the problem of subselecting acoustic data. We evaluate our results on data subset selection for a phone recognition task. Our framework shows significant improvements over random selection and previously proposed methods using a similar amount of resources.

[1]  M. L. Fisher,et al.  An analysis of approximations for maximizing submodular set functions—I , 1978, Math. Program..

[2]  Hsiao-Wuen Hon,et al.  Speaker-independent phone recognition using hidden Markov models , 1989, IEEE Trans. Acoust. Speech Signal Process..

[3]  Alexander H. Waibel,et al.  Unsupervised training of a speech recognizer using TV broadcasts , 1998, ICSLP.

[4]  Jack Edmonds,et al.  Submodular Functions, Matroids, and Certain Polyhedra , 2001, Combinatorial Optimization.

[5]  Jean-Luc Gauvain,et al.  Lightly supervised and unsupervised acoustic model training , 2002, Comput. Speech Lang..

[6]  Dilek Z. Hakkani-Tür,et al.  Active learning for automatic speech recognition , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[7]  Roger K. Moore A comparison of the data requirements of automatic speech recognition systems and human listeners , 2003, INTERSPEECH.

[8]  Juho Rousu,et al.  Efficient Computation of Gapped Substring Kernels on Large Alphabets , 2005, J. Mach. Learn. Res..

[9]  Rong Zhang,et al.  Data selection for speech recognition , 2007, 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU).

[10]  Berlin Chen,et al.  Training data selection for improving discriminative training of acoustic models , 2007, 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU).

[11]  Hui Lin,et al.  How to select a good training-data subset for transcription: submodular active selection for sequences , 2009, INTERSPEECH.

[12]  Andreas Krause,et al.  Submodularity and its applications in optimized information gathering , 2011, TIST.

[13]  Hui Lin,et al.  A Class of Submodular Functions for Document Summarization , 2011, ACL.

[14]  Tara N. Sainath,et al.  N-best entropy based data selection for acoustic modeling , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).