Towards machines that know when they do not know: Summary of work done at 2014 Frederick Jelinek Memorial Workshop

A group of junior and senior researchers gathered as part of the 2014 Frederick Jelinek Memorial Workshop in Prague to address the problem of predicting the accuracy of a nonlinear deep neural network (DNN) probability estimator on unknown data from an application domain different from the one in which the estimator was trained. The paper describes the problem and summarizes the approaches taken by the group.
