Mining collective knowledge: inferring functional labels from online review for business

With the increasing popularity of online e-commerce services, users constantly generate a large volume of online reviews. In this paper, we study the problem of inferring functional labels from online review text. Functional labels summarize and highlight the main characteristics of a business, and can serve as bridges between consumption needs and service functions. We consider two kinds of semantic similarity, lexical similarity and embedding similarity, which characterize relatedness from two different perspectives. To measure lexical similarity, we use the classic probabilistic ranking formula BM25; to measure embedding similarity, we propose an extended embedding model that can incorporate weakly supervised information derived from review text. These two kinds of similarity complement each other and capture semantic relatedness in a more comprehensive way. We construct a test collection covering four different domains based on a Yelp dataset and compare against multiple baseline methods. Extensive experiments show that the proposed methods are effective.
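
As an illustration of the lexical similarity component, the following is a minimal sketch of BM25 scoring, not the paper's implementation. It assumes that the words of a candidate label act as the query and that each business's tokenized review text acts as a document; this mapping, the function name bm25_scores, the parameter values k1 = 1.2 and b = 0.75, the smoothed non-negative IDF variant, and the toy reviews are all assumptions made for the example.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms` with BM25.

    Assumption: uses the smoothed IDF log(1 + (N - df + 0.5) / (df + 0.5)),
    which is never negative. The abstract does not say which variant is used.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue  # a term absent from the document contributes nothing
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy data: one merged, tokenized review text per business.
reviews = {
    "cafe": "great coffee and free wifi good for working".split(),
    "bar": "loud music cheap beer late night".split(),
    "diner": "good coffee and pancakes for breakfast".split(),
}
label_terms = ["coffee", "wifi"]  # hypothetical candidate label
scores = dict(zip(reviews, bm25_scores(label_terms, list(reviews.values()))))
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy example the business whose reviews mention both label words ranks first, and the one that mentions neither receives a score of zero. Purely lexical matching of this kind cannot relate a label to reviews that use different wording, which is the gap the embedding similarity is meant to cover.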
