Adaptive speech analytics: system, infrastructure, and behavior

This paper describes an adaptive system and infrastructure for speech analytics. The system is built on the UIMA framework and consists of a set of analysis engines (analytics) and control units. Its input is an unspecified, ever-changing number of continuous audio streams; its output is the detection of events consistent with a focus of analysis and/or the discovery of relationships among the outputs of the constituent analytics. The central theme is the system's ability to use the metadata generated during analysis to adapt both the behavior of the underlying analytics engines and the overall data flow, adjusting the granularity and accuracy of the analysis so that increasing amounts of data can be processed with limited resources.
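To make the adaptation idea concrete, the following is a minimal sketch of how a control unit might trade granularity for throughput. It is not the system's actual mechanism or the UIMA API. The `Level` type, the `LEVELS` table with its cost and accuracy figures, the per-stream `interest` scores (standing in for metadata produced by the analytics), and the `choose_levels` function are all hypothetical names and values introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    """A hypothetical analysis setting: finer levels cost more per stream."""
    name: str
    cost: float      # assumed compute units per stream
    accuracy: float  # assumed relative accuracy, illustrative only


# Ordered from finest (most expensive) to coarsest (cheapest).
LEVELS = [
    Level("fine", cost=4.0, accuracy=0.95),
    Level("medium", cost=2.0, accuracy=0.85),
    Level("coarse", cost=1.0, accuracy=0.70),
]


def choose_levels(interest, budget):
    """Assign an analysis level to every stream under a compute budget.

    interest: dict mapping stream id -> score derived from analysis metadata
              (higher means more relevant to the focus of analysis).
    budget:   total compute units available.

    Every stream starts at the coarsest level; streams are then upgraded in
    decreasing order of interest to the finest level the remaining budget
    allows. If even the coarsest level exceeds the budget, all streams simply
    stay coarse.
    """
    coarsest = LEVELS[-1]
    plan = {sid: coarsest for sid in interest}
    spent = coarsest.cost * len(plan)

    for sid in sorted(interest, key=interest.get, reverse=True):
        for level in LEVELS:  # try the finest level first
            extra = level.cost - plan[sid].cost
            if spent + extra <= budget:
                plan[sid] = level
                spent += extra
                break
    return plan


if __name__ == "__main__":
    plan = choose_levels({"a": 0.9, "b": 0.1, "c": 0.5}, budget=7.0)
    for sid, level in sorted(plan.items()):
        print(sid, level.name)
```

Under these assumptions, as more streams arrive with a fixed budget, low-interest streams fall back to coarse analysis while the streams most relevant to the focus of analysis keep the finest affordable setting.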
