Paper information

Topic detection and tracking for conversational content by using conceptual dynamic latent Dirichlet allocation

This study proposes a conceptual dynamic latent Dirichlet allocation (CDLDA) model for topic detection and tracking in conversational content. Topic detection and tracking is vital for conversational communication, especially for spoken interactions. Because topic transitions occur frequently during conversation (i.e., a conversation usually contains many topics), language processors must detect the different topics in conversational content. To reflect the structure of spoken dialogue, a dynamic model is employed to capture the sequence of two adjacent topics in spoken content. The proposed model uses the proportions of verbs and nouns to analyze the similarity between utterances. An agglomerative clustering algorithm, based on an ontology defined in E-HowNet, clusters conversational utterances. Because the topic structure of conversational content is fragile, the model draws on E-HowNet hypernym relationships of speech acts to obtain robust solutions, even for sparse data. Compared with the traditional latent Dirichlet allocation (LDA) model, which detects topics only through a bag-of-words technique, the proposed model considers temporal features by introducing dynamic concepts. Experimental results showed that the proposed approach outperformed the traditional dynamic LDA (DLDA), LDA, and support vector machine models, and that it performed well on topic detection and tracking in conversations.
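The clustering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes utterances are already tagged into noun and verb sets, replaces the E-HowNet concept and hypernym matching with plain Jaccard overlap on surface words, and uses average linkage. The function names, the `noun_weight` parameter, and the `threshold` value are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def utterance_similarity(u, v, noun_weight=0.5):
    """Weighted mix of noun overlap and verb overlap (the weighting is an assumption)."""
    return (noun_weight * jaccard(u["nouns"], v["nouns"])
            + (1.0 - noun_weight) * jaccard(u["verbs"], v["verbs"]))


def cluster_similarity(c1, c2, utterances, noun_weight):
    """Average-linkage similarity between two clusters of utterance indices."""
    pairs = [(i, j) for i in c1 for j in c2]
    total = sum(utterance_similarity(utterances[i], utterances[j], noun_weight)
                for i, j in pairs)
    return total / len(pairs)


def agglomerative_cluster(utterances, threshold=0.3, noun_weight=0.5):
    """Merge the most similar pair of clusters until no pair reaches the threshold."""
    clusters = [[i] for i in range(len(utterances))]
    while len(clusters) > 1:
        best_sim, (a, b) = max(
            (cluster_similarity(clusters[x], clusters[y], utterances, noun_weight), (x, y))
            for x, y in combinations(range(len(clusters)), 2)
        )
        if best_sim < threshold:
            break
        # a < b always holds, so deleting index b leaves index a valid.
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Toy, hand-tagged utterances (hypothetical data, not from the paper).
utterances = [
    {"nouns": {"flight", "ticket"}, "verbs": {"book"}},
    {"nouns": {"flight", "seat"}, "verbs": {"book"}},
    {"nouns": {"weather"}, "verbs": {"rain"}},
    {"nouns": {"weather", "umbrella"}, "verbs": {"rain", "bring"}},
]
topic_clusters = agglomerative_cluster(utterances)
```

In this toy input the two travel utterances and the two weather utterances share nouns and verbs within each pair but nothing across pairs, so the sketch groups them into two clusters. An ontology-based similarity, as in the paper, would additionally match words that differ on the surface but share a hypernym.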
