Opinmine - Opinion Analysis System by CUHK for NTCIR-6 Pilot Task

This paper presents Opinmine, the CUHK opinion analysis system for the NTCIR-6 pilot task. Opinmine comprises three functional modules: (1) The Preprocessing and Assignment Module (PAM) performs word segmentation, part-of-speech (POS) tagging and named entity recognition on the input Chinese text, using a lexicalized Hidden Markov Model and heuristic rules. (2) The Knowledge Acquisition Module (KAM) applies unsupervised learning techniques to acquire opinion knowledge, including opinion operators, opinion indicators and opinion words, from annotated data and Web data. (3) The Sentence Analysis Module (SAM) analyzes each input sentence to determine whether it is opinionated. For each opinionated sentence, its opinion holders, opinion operators and opinion words are recognized and its polarity is determined. Furthermore, the relevance between the sentence and a topic is judged based on sentence-topic and document-topic relevance. Under lenient evaluation, the F1 scores of Opinmine in opinion extraction, polarity decision and relevance judgment are 0.635, 0.405 and 0.812, respectively; under strict evaluation, the F1 scores are 0.427, 0.296 and 0.616, respectively.
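
For readers unfamiliar with the F1 measure reported above, the following minimal sketch shows how such a score can be computed for the opinionated-sentence decision. It assumes the standard definition of F1 as the harmonic mean of precision and recall; the function name `f1_score` and the sentence IDs are hypothetical illustrations, not part of Opinmine or of the NTCIR-6 evaluation tooling, and the numbers are not results from the task.

```python
def f1_score(predicted: set, gold: set) -> float:
    """Return the harmonic mean of precision and recall for two sets of sentence IDs."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        # Covers empty inputs and the case of no overlap.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: IDs of sentences the system marked as opinionated,
# and IDs of sentences the annotators marked as opinionated.
predicted_opinionated = {1, 2, 3, 5}
gold_opinionated = {2, 3, 4, 5, 6}

score = f1_score(predicted_opinionated, gold_opinionated)
print(f"F1 = {score:.3f}")
```

In this sketch the lenient and strict settings would differ only in how the gold set is built from the annotators' judgments; the formula itself stays the same.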