Sentiment Analysis: A Literature Survey

Our day-to-day life has always been influenced by what people think, and the ideas and opinions of others have always affected our own. The explosion of Web 2.0 has led to increased activity in podcasting, blogging, tagging, contributing to RSS, social bookmarking, and social networking. As a result, there has been a surge of interest in mining these vast resources of data for opinions. Sentiment Analysis (SA), or Opinion Mining, is the computational treatment of opinions, sentiments, and subjectivity in text. In this report, we look at the challenges and applications of Sentiment Analysis. We discuss in detail various approaches to the computational treatment of sentiments and opinions. Supervised, or data-driven, techniques for SA such as Naïve Bayes, Maximum Entropy, SVM, and Voted Perceptrons are discussed, along with their strengths and drawbacks. We also examine a new dimension of analyzing sentiment through Cognitive Psychology, mainly via the work of Janyce Wiebe, covering ways to detect subjectivity and perspective in narrative and to understand discourse structure. Finally, we study some specific topics in Sentiment Analysis and the contemporary work in those areas.
