Large-scale joint topic, sentiment & user preference analysis for online reviews

This paper presents a non-trivial reconstruction of TSPRA, a previous joint topic-sentiment-preference review model, using a stick-breaking representation under the frameworks of variational inference (VI) and stochastic variational inference (SVI). TSPRA is a Gibbs sampling based model that jointly infers topics, word sentiments, and user preferences. It has been shown to achieve good performance, but on large datasets it can only learn from a relatively small sample. We develop the variational models vTSPRA and svTSPRA to reduce running time, and the new approach is capable of processing millions of reviews. We rebuild the generative process, improve the rating regression, derive and present the coordinate-ascent updates of the variational parameters, and show that the time complexity of each iteration is theoretically linear in the corpus size. Experiments on Amazon datasets show that the new approach converges faster than TSPRA and attains better results given the same amount of time. In addition, we adapt svTSPRA into an online algorithm, ovTSPRA, that can monitor oscillations of sentiment and preference over time. Some interesting fluctuations are captured, and possible explanations are provided. The results give strong visual evidence that user preference is better treated as a factor independent of sentiment.
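For readers unfamiliar with the stick-breaking representation mentioned above, the following is a minimal sketch of the generic truncated stick-breaking construction for a Dirichlet process, the form commonly used in variational treatments. It is not the vTSPRA model itself: the function name, the concentration parameter `alpha`, and the truncation level are illustrative assumptions.

```python
import random


def stick_breaking_weights(alpha, truncation, seed=0):
    """Draw truncated stick-breaking weights (hypothetical helper, not from the paper).

    alpha: assumed concentration parameter (> 0).
    truncation: assumed number of components kept (>= 1).
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction broken off the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # last component absorbs the leftover stick
    return weights


weights = stick_breaking_weights(alpha=1.0, truncation=10)
```

Truncating at a fixed level keeps the number of variational parameters finite; the final weight takes whatever mass is left so the weights sum to one.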
