Paper information - Empath: A framework for evaluating entity-level sentiment analysis

Sentiment analysis is a fundamental component of text-driven monitoring and forecasting systems, in which the general sentiment towards real-world entities (e.g., people, products, organizations) is analyzed based on the sentiment signals embedded in the vast amount of web text available today. Building such systems involves several practically important problems, from data cleansing (e.g., boilerplate removal, web-spam detection) and sentiment analysis at the individual mention level (e.g., phrase-, sentence-, document-level) to the aggregation of sentiment for entity-level (e.g., person, company) analysis. Most previous research in sentiment analysis, however, has focused only on mention-level analysis, and there has been relatively little work on the other practically important problems involved in building a large-scale sentiment monitoring system. In this paper, we propose Empath, a new framework for evaluating entity-level sentiment analysis. Empath leverages objective measurements of entities in various domains, such as people, companies, countries, movies, and sports, to facilitate entity-level sentiment analysis and tracking. We demonstrate the utility of Empath for the evaluation of a large-scale sentiment system by applying it to various lexicons using Lydia, our own large-scale text-analytics tool, over a corpus consisting of more than a terabyte of newspaper data. We expect that Empath will encourage research that encompasses end-to-end pipelines to enable large-scale text-driven monitoring and forecasting systems.
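
The abstract does not spell out how entity-level sentiment is scored against objective measurements, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes mention-level polarities in {-1, 0, +1}, averages them per entity, and compares the resulting ranking with an objective measurement using Spearman rank correlation. The function names, the averaging rule, the choice of Spearman correlation, and the sample data are all illustrative assumptions.

```python
from collections import defaultdict


def aggregate_entity_sentiment(mentions):
    """Average mention-level polarity per entity.

    `mentions` is an iterable of (entity, polarity) pairs, where polarity
    is assumed to be -1, 0, or +1 (an assumption of this sketch).
    """
    totals = defaultdict(lambda: [0.0, 0])  # entity -> [sum, count]
    for entity, polarity in mentions:
        totals[entity][0] += polarity
        totals[entity][1] += 1
    return {entity: s / n for entity, (s, n) in totals.items()}


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined for a constant ranking
    return cov / (vx * vy) ** 0.5


# Hypothetical data: mention polarities and an objective score per entity.
mentions = [
    ("movie_a", 1), ("movie_a", 1), ("movie_a", -1),
    ("movie_b", -1), ("movie_b", -1),
    ("movie_c", 1), ("movie_c", 0),
]
objective = {"movie_a": 120.0, "movie_b": 15.0, "movie_c": 80.0}

sentiment = aggregate_entity_sentiment(mentions)
entities = sorted(objective)
rho = spearman([sentiment[e] for e in entities],
               [objective[e] for e in entities])
print(f"Spearman correlation between sentiment and objective score: {rho:.3f}")
```

Under this reading, a lexicon whose aggregated scores rank entities more consistently with the objective measurement would be judged the better one for entity-level analysis.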
