Topical unit classification using deep neural nets and probabilistic sampling

Understanding topical units is important both for improving human-computer interaction (HCI) and for better understanding human-human interaction. Here, we take the first steps towards topical unit recognition by creating a topical unit classifier based on the HuComTech multimodal database. We build this classifier using Deep Rectifier Neural Nets (DRN) trained with the technique of probabilistic sampling, and we measure its performance with the Unweighted Average Recall (UAR) metric. We demonstrate in several experiments that the proposed method clearly outperforms both a support vector machine and a deep neural net trained without probabilistic sampling. We also experiment with the number of topical unit labels, and examine whether it is feasible in this framework to distinguish between different types of topic change based on their level of motivatedness.
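To make the two central notions concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the usual definition of UAR as the mean of the per-class recalls, and one common formulation of probabilistic sampling in which a class is first drawn with probability lam / K + (1 - lam) * prior(class), and a training example is then drawn uniformly from that class. The function names, the parameter `lam`, and the toy labels are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; every class counts equally, however rare."""
    correct, total = Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def class_sampling_probs(labels, lam):
    """Interpolate between the class priors (lam=0) and a uniform
    distribution over classes (lam=1). Assumed formulation."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: lam / k + (1 - lam) * counts[c] / n for c in counts}


def probabilistic_sample(examples, labels, lam, size, rng):
    """Draw `size` training pairs: pick a class first, then an example
    uniformly from that class."""
    by_class = defaultdict(list)
    for x, y in zip(examples, labels):
        by_class[y].append(x)
    probs = class_sampling_probs(labels, lam)
    classes = sorted(probs)
    weights = [probs[c] for c in classes]
    drawn = rng.choices(classes, weights=weights, k=size)
    return [(rng.choice(by_class[c]), c) for c in drawn]


if __name__ == "__main__":
    # Toy, made-up labels: 9 "no_change" frames and 1 "topic_change" frame.
    labels = ["no_change"] * 9 + ["topic_change"]
    examples = list(range(len(labels)))
    print(class_sampling_probs(labels, lam=1.0))
    batch = probabilistic_sample(examples, labels, 1.0, 8, random.Random(0))
    print(batch)
```

With `lam=0` the sampler reproduces the original, skewed class distribution, while `lam=1` presents every class equally often. A classifier that always predicts the majority class can reach high accuracy on skewed data, but its UAR stays at 1 / K, which is why the metric suits an imbalanced task such as this one.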
