Performance and trends in recent opinion retrieval techniques

This paper presents the trends in, and performance of, opinion retrieval techniques proposed within the last eight years. We identify the major techniques in opinion retrieval and group them into four popular categories. We describe the state-of-the-art techniques in each category, emphasizing their performance and limitations. We then summarize the techniques in a table that compares their performance on different datasets. Finally, we highlight possible future research directions that can help solve existing challenges in opinion retrieval.
