Incorporate the Syntactic Knowledge in Opinion Mining in User-generated Content

As Internet access has spread, the content of the Web has been changing. User-generated content (UGC), a novel kind of media content produced by end users, has taken off in the past few years with the rise of Web 2.0, and its growth has been especially impressive in China. The adoption of UGC has proven beneficial to a number of traditional tasks. However, the dramatic increase in the volume of such data prevents users from exploiting it manually, so automatic mining approaches are needed. Opinion mining, a recent data mining technique at the crossroads of information retrieval and computational linguistics, is well suited to this kind of information processing. In this paper, we address the two main subtasks of opinion mining: topic extraction and sentiment classification. For each subtask, we propose an approach for Chinese that draws on syntactic knowledge. We use blog data, a typical form of UGC, as the evaluation data in our experiments, and the results show that our approaches to both tasks are promising. We also outline future work stemming from this paper: an intelligent advertisement placement system for UGC.
