When calls go wrong: how to detect problematic calls based on log-files and emotions?

Traditionally, the prediction of problematic calls in Interactive Voice Response systems in call centers has been based either on dialog state transitions and recognition log data, or on caller emotion. We present a combined model incorporating both types of features, which achieved 79.22% accuracy in classifying problematic versus non-problematic calls after only the first four turns of a human-computer dialogue. We found that adding acoustic features indicating caller emotion did not yield a significant increase in accuracy.
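To make the combined-feature setup concrete, the following is a minimal illustrative sketch, not the paper's published pipeline: the abstract does not specify the classifier or the exact feature encoding, so the classifier choice (logistic regression) and all feature names here are hypothetical, and the data is synthetic.

```python
# Hedged sketch: combine log-based and emotion-based features from the first
# four dialogue turns into one vector per call and train a binary classifier.
# Feature semantics (ASR confidence, no-match counts, anger scores) are
# assumptions for illustration; they are not taken from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_calls = 500

# Log-based features per call (e.g. ASR confidence, no-match/no-input counts,
# dialog state transition indicators over the first four turns) -- synthetic here.
log_features = rng.normal(size=(n_calls, 8))

# Emotion-based features per call (e.g. per-turn anger scores) -- synthetic here.
emotion_features = rng.normal(size=(n_calls, 4))

# Binary label: 1 = problematic call, 0 = non-problematic call (synthetic).
labels = rng.integers(0, 2, size=n_calls)

# Combined model: concatenate both feature sets into one vector per call.
X = np.hstack([log_features, emotion_features])

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.3, random_state=0
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("Held-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

In this kind of setup, the reported finding would correspond to the log-feature columns carrying most of the predictive signal, with the emotion columns adding little once they are included alongside them.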