CLOpinionMiner: Opinion Target Extraction in a Cross-Language Scenario

Opinion target extraction is a subtask of opinion mining that is useful in many applications. The problem is usually solved by training a sequence labeler on manually labeled data. However, labeled training data are unevenly distributed across languages, and the lack of a labeled corpus in a given language limits research progress on opinion target extraction in that language. To address this problem, we propose a novel system called CLOpinionMiner, which leverages the rich labeled data in a source language for opinion target extraction in a different target language. In this study, we focus on English-to-Chinese cross-language opinion target extraction. From the English dataset, our method produces two Chinese training datasets with different features. Two labeling models for Chinese opinion target extraction are then trained using Conditional Random Fields (CRFs). After that, we use a monolingual co-training algorithm to improve the performance of both models by leveraging the large amount of unlabeled Chinese review text available on the web. Experimental results show the effectiveness of our proposed approach.
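The co-training step can be illustrated with a minimal sketch. It is not the CLOpinionMiner implementation: the abstract does not give the selection rule, the features, or the CRF configuration, so everything below is an assumption made for illustration. `CountModel` is a hypothetical toy stand-in for a CRF labeler, each example is a pair of features (one per view), and `rounds`, `per_round`, and `threshold` are invented parameters. In each round, every model labels the unlabeled pool and hands its most confident predictions to the other model as extra training data.

```python
from collections import Counter, defaultdict


class CountModel:
    """Toy stand-in for a CRF labeler (assumption, not the paper's model).

    It looks at one view of each example and predicts the majority label
    seen for that feature value during training.
    """

    def __init__(self, view):
        self.view = view
        self.counts = defaultdict(Counter)

    def fit(self, examples, labels):
        self.counts = defaultdict(Counter)
        for x, y in zip(examples, labels):
            self.counts[x[self.view]][y] += 1
        return self

    def predict(self, x):
        """Return (label, confidence); unseen features give (None, 0.0)."""
        seen = self.counts.get(x[self.view])
        if not seen:
            return None, 0.0
        label, n = seen.most_common(1)[0]
        return label, n / sum(seen.values())


def co_train(labeled, labels, unlabeled, rounds=5, per_round=2, threshold=0.9):
    """Generic co-training loop over two feature views (illustrative only)."""
    train_a, y_a = list(labeled), list(labels)
    train_b, y_b = list(labeled), list(labels)
    pool = list(unlabeled)
    model_a, model_b = CountModel(0), CountModel(1)

    for _ in range(rounds):
        model_a.fit(train_a, y_a)
        model_b.fit(train_b, y_b)
        if not pool:
            break
        moved = set()
        # Each model teaches the other with its most confident predictions.
        for model, other_x, other_y in (
            (model_a, train_b, y_b),
            (model_b, train_a, y_a),
        ):
            scored = []
            for i, x in enumerate(pool):
                label, conf = model.predict(x)
                if conf >= threshold:
                    scored.append((conf, i, label))
            scored.sort(key=lambda t: -t[0])
            for conf, i, label in scored[:per_round]:
                other_x.append(pool[i])
                other_y.append(label)
                moved.add(i)
        if not moved:
            break
        pool = [x for i, x in enumerate(pool) if i not in moved]

    # Final refit so both models reflect every example they received.
    model_a.fit(train_a, y_a)
    model_b.fit(train_b, y_b)
    return model_a, model_b


# Toy data: each example is (word, preceding word); "T" = opinion target.
labeled = [("screen", "the"), ("great", "is")]
labels = ["T", "O"]
unlabeled = [("battery", "the"), ("battery", "its"), ("awful", "is")]

model_a, model_b = co_train(labeled, labels, unlabeled)
```

In this toy run, the context view first labels "battery" after "the" as a target and passes it to the word view; the word view then recognizes "battery" after "its" and passes that example back, so the context view learns a context it never saw in the labeled data. A real system would replace `CountModel` with sequence labelers trained on the two Chinese datasets.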
