Topic subject creation using unsupervised learning for topic modeling

We describe the use of Non-Negative Matrix Factorization (NMF) and Latent Dirichlet Allocation (LDA) to perform topic mining and labelling on retail customer communications, in an attempt to characterize the subject of customer inquiries. We compare the topic mining performance of the two algorithms and propose methods to assign topic subject labels automatically.
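As a rough illustration of the NMF side of this pipeline, the sketch below factors a tiny document-term count matrix with the standard multiplicative update rules and then labels each topic with its two highest-weight terms. The example messages, the raw bag-of-words counts, the choice of two topics, and the top-terms labelling heuristic are all assumptions made for illustration; they are not taken from the paper and may differ from the labelling methods it proposes.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, k, iters=500, seed=0, eps=1e-9):
    """Approximate non-negative v (docs x terms) as w @ h with k topics,
    using multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h

# Hypothetical customer messages, already lower-cased and tokenised by spaces.
docs = [
    "refund order refund late",
    "late delivery order late",
    "password login account reset",
    "login account password locked",
]
vocab = sorted({term for doc in docs for term in doc.split()})
v = [[doc.split().count(term) for term in vocab] for doc in docs]

n_topics = 2  # assumed; in practice this would be tuned
w, h = nmf(v, n_topics)

# Label each topic with its two highest-weight terms (an assumed heuristic).
labels = []
for row in h:
    top = sorted(range(len(vocab)), key=lambda j: row[j], reverse=True)[:2]
    labels.append(" / ".join(vocab[j] for j in top))

# Assign each document to the topic with the largest weight in w.
doc_topics = [max(range(n_topics), key=lambda j: row[j]) for row in w]

for doc, topic in zip(docs, doc_topics):
    print(f"{labels[topic]!r:30} <- {doc}")
```

Each row of `h` holds one topic's term weights and each row of `w` holds one document's topic weights, which is what lets the top terms of a topic serve as a candidate subject label. An LDA comparison would need a probabilistic inference routine and is left out of this sketch.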
