A directed topic model applied to call center improvement

We propose subject matter expert refined topic (SMERT) allocation, a generative probabilistic model for clustering freestyle text. SMERT models are three-level hierarchical Bayesian models in which each item is modeled as a finite mixture over a set of topics. In addition to discrete data inputs, we introduce binomial inputs. These 'high-level' data inputs permit 'boosting' (affirming) some terms in the topic definitions and 'zapping' others. We also present a collapsed Gibbs sampler for efficient estimation. The methods are illustrated using real-world data from a call center. Finally, we compare SMERT with three alternative approaches using two criteria. Copyright © 2015 John Wiley & Sons, Ltd.
