Paper information - Opinion integration and summarization

Opinion integration and summarization

As Web 2.0 applications become increasingly popular, more and more people express their opinions on the Web in various ways in real time. Such wide coverage of topics and abundance of users make the Web an extremely valuable source for mining people’s opinions about all kinds of topics. However, since the opinions are usually expressed as unstructured text scattered across different sources, it is still difficult for users to digest all opinions relevant to a specific topic with current technologies. This thesis focuses on the problem of opinion integration and summarization, whose goal is to better support user digestion of huge amounts of opinions on an arbitrary topic. To study this problem systematically, we have identified three important dimensions of opinion analysis: separation of aspects (or subtopics) of opinions, understanding of sentiments, and assessment of the quality of opinions. These dimensions form three key components of an integrated opinion summarization system. Accordingly, this thesis contributes novel and general computational techniques for three synergistic tasks: (1) integrating relevant opinions from all kinds of Web 2.0 sources and organizing them along different aspects of the topic, which not only serves as a semantic grouping of opinions but also facilitates user navigation of the huge opinion space; (2) inferring the sentiments in the opinions with respect to different aspects and different opinion holders, so as to provide users with a more detailed and informed multi-perspective view of the opinions; and (3) improving the prediction of opinion quality, which critically determines the usefulness of the information extracted from the opinions. We focus on general and robust methods that require minimal human supervision, so that the automated methods are applicable to a wide range of topics and scalable to large amounts of opinions. This focus differentiates this thesis from work that is fine-tuned or well-trained for particular domains but is not easily adaptable to new domains. Our main idea is to exploit naturally available resources, such as structured ontologies and social networks, which serve as indirect signals and guidance for generating opinion summaries. Along this line, our proposed techniques have been shown to be effective and general enough to be applied to potentially many interesting applications in multiple domains, such as business intelligence and political science.

Yue Lu | ChengXiang Zhai
