Integrating Topic Model and Heterogeneous Information Network for Aspect Mining with Rating Bias

Recently, there has been a surge of research on aspect mining, where the goal is to predict the aspect ratings of shops from reviews and overall ratings. Traditional methods assume that all aspect ratings in a given review are at the same level, equal to the corresponding overall rating. However, recent research reveals a different phenomenon: there is an obvious rating bias between aspect ratings and overall ratings. Moreover, these methods usually analyze the aspect ratings of reviews with topic models at the textual level, while ignoring the structural information among multiple entities (users, shops, reviews), which can be captured by a Heterogeneous Information Network (HIN). In this paper, we present a novel model integrating Topic model and HIN for Aspect Mining with rating bias (called THAM). First, a phrase-level LDA model is designed to extract the topic distributions of reviews from textual information. Second, to make full use of structural information, we construct a topic propagation network and propagate topic distributions in this heterogeneous network. Finally, by treating the review as the shared factor, the two parts are integrated into a unified optimization framework. Experimental results on two real datasets demonstrate that THAM achieves significant performance improvement compared with the state of the art.
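
The idea of propagating topic distributions through a network of users, shops, and reviews can be illustrated with a minimal sketch. The abstract does not specify THAM's propagation rule or objective, so everything below is an assumption made for illustration: the function name `propagate`, the mixing weight `alpha`, the averaging update, and the toy data are all hypothetical and are not the authors' method. Each review starts from a text-based topic distribution (as a phrase-level LDA might produce) and is repeatedly smoothed toward the average distributions of its author and its shop.

```python
from collections import defaultdict


def normalize(vec):
    """Scale a non-negative vector so that it sums to 1."""
    total = sum(vec)
    if total == 0:
        return [1.0 / len(vec)] * len(vec)
    return [v / total for v in vec]


def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def propagate(review_topics, review_user, review_shop, alpha=0.5, iterations=10):
    """Smooth review topic distributions over a user-shop-review network.

    Hypothetical update rule (an assumption, not the rule from the paper):
    each review keeps a share `alpha` of its text-based distribution and
    takes the remaining share from the mean of its user's and its shop's
    current distributions.
    """
    text_based = {r: normalize(t) for r, t in review_topics.items()}
    current = dict(text_based)
    for _ in range(iterations):
        # Aggregate review distributions onto users and shops.
        by_user, by_shop = defaultdict(list), defaultdict(list)
        for r, dist in current.items():
            by_user[review_user[r]].append(dist)
            by_shop[review_shop[r]].append(dist)
        user_topics = {u: mean(ds) for u, ds in by_user.items()}
        shop_topics = {s: mean(ds) for s, ds in by_shop.items()}
        # Push the aggregated distributions back to each review.
        updated = {}
        for r in current:
            neighbor = mean([user_topics[review_user[r]], shop_topics[review_shop[r]]])
            mixed = [alpha * t + (1 - alpha) * n
                     for t, n in zip(text_based[r], neighbor)]
            updated[r] = normalize(mixed)
        current = updated
    return current


# Toy data: three reviews over two topics, two users, two shops.
review_topics = {"r1": [0.8, 0.2], "r2": [0.6, 0.4], "r3": [0.1, 0.9]}
review_user = {"r1": "u1", "r2": "u1", "r3": "u2"}
review_shop = {"r1": "s1", "r2": "s2", "r3": "s2"}

smoothed = propagate(review_topics, review_user, review_shop)
for review_id, dist in sorted(smoothed.items()):
    print(review_id, [round(p, 3) for p in dist])
```

With `alpha=1.0` the structural information is ignored and each review keeps its text-based distribution, which corresponds to the purely textual setting that the abstract argues is insufficient.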
