This paper presents Opinmine, the CUHK opinion analysis system for the NTCIR-6 pilot task. Opinmine comprises three functional modules: (1) The Preprocessing and Assignment Module (PAM) performs word segmentation, part-of-speech (POS) tagging and named entity recognition on the input Chinese text. It is based on a lexicalized Hidden Markov Model and heuristic rules. (2) The Knowledge Acquisition Module (KAM) applies unsupervised learning techniques to acquire several kinds of opinion knowledge, including opinion operators, opinion indicators and opinion words, from annotated data and Web data. (3) The Sentence Analysis Module (SAM) analyzes each input sentence to determine whether it is opinionated. For each opinionated sentence, its opinion holders, opinion operators and opinion words are recognized and its polarity is determined. Furthermore, the relevance between the sentence and a topic is judged based on sentence-topic and document-topic relevance. Under lenient evaluation, Opinmine achieves F1 scores of 0.635, 0.405 and 0.812 in opinion extraction, polarity decision and relevance judgment, respectively; under strict evaluation, the corresponding F1 scores are 0.427, 0.296 and 0.616.