Polarity analysis of micro reviews in Foursquare

On Foursquare, one of the most popular location-based social networks, users can not only share which places (venues) they visit but also leave short comments (tips) about their experiences at specific venues. Tips can provide valuable feedback for business owners as well as for potential new customers. Sentiment, or polarity, classification supports opinion summarization, which helps both parties quickly grasp the predominant opinion that users have posted about a specific venue. We present what is, to our knowledge, the first study of the polarity of Foursquare tips. We start by characterizing two datasets of collected tips with respect to their textual content. Some inherent characteristics of tips, such as their short length and their informal, often noisy content, pose great challenges to polarity detection. We then investigate the effectiveness of four alternative polarity classification strategies on subsets of our data. Three of the strategies are based on supervised machine learning techniques, and the fourth is an unsupervised lexicon-based approach. Our evaluation indicates that effective polarity classification can be achieved even with the simpler lexicon-based approach, which does not require costly manual labeling of tips.
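As a rough illustration of what an unsupervised lexicon-based polarity classifier can look like, the sketch below sums word scores over a tip and flips the sign of a sentiment word that follows a negator. It is not the method evaluated in the study: the abstract does not specify the lexicon or the scoring rule, so the toy `LEXICON`, the `NEGATORS` set, and the function names are all assumptions made for this example.

```python
import re

# Hypothetical toy lexicon (assumption); a real system would load a
# published lexical resource instead of this hand-written dictionary.
LEXICON = {
    "great": 1.0, "love": 1.0, "good": 0.5, "friendly": 0.5,
    "bad": -0.5, "slow": -0.5, "awful": -1.0, "rude": -1.0,
}
# Hypothetical negator list (assumption).
NEGATORS = {"not", "no", "never"}


def tokenize(text):
    """Lowercase the tip and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def polarity(tip):
    """Label a tip as 'positive', 'negative', or 'neutral'.

    Sums lexicon scores over the tokens; a negator flips the sign of
    the next sentiment-bearing word.
    """
    score = 0.0
    negate = False
    for token in tokenize(tip):
        if token in NEGATORS:
            negate = True
            continue
        value = LEXICON.get(token, 0.0)
        if value:
            score += -value if negate else value
            negate = False
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Because no labeled training data is involved, a scorer of this kind can be applied directly to unseen tips, for example `polarity("Great coffee, friendly staff")`; its quality depends entirely on how well the lexicon covers the informal vocabulary found in tips.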
