Using and Supplementing the English SentiWordNet to Construct the Korean Sentiment Lexicon DecoSelex
In this study, we present DecoSelex, a Korean sentiment lexicon that we constructed for the sentiment analysis of user-generated texts. We begin by examining the results of translating the English SentiWordNet (SWN) into Korean sentiment word lists with Google Translate. The translated lists contain untranslated and irrelevant words; after manually eliminating these inappropriate terms, we obtained 3,665 candidate Korean sentiment words. However, the nouns, which make up a large share of this list, are mostly names of medical diseases, while the adjectives, generally the most significant part of speech in a sentiment lexicon, are extremely few. This situation led us to examine the DECO Korean electronic dictionary, a larger and more reliable electronic resource for Korean. From the DECO entries we obtained 35,452 sentiment-related words and classified them into several sub-classes according to their polarity properties and psychological meanings. To evaluate DecoSelex, we used a dataset of 1,200 review texts of IT products. The current version of DecoSelex achieved 83.6% precision and 83.2% recall for positive words, and 72.9% precision and 69.1% recall for negative words.
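For readers unfamiliar with the evaluation measures, the following is a minimal sketch of how per-polarity precision and recall could be computed. The abstract does not state how matches were counted, so the set-based matching used here is an assumption, and the function name `precision_recall` and the toy word lists are hypothetical illustrations, not the study's data.

```python
from typing import Iterable, Tuple


def precision_recall(predicted: Iterable[str], gold: Iterable[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one polarity class.

    predicted: words the lexicon tags with this polarity (assumed input).
    gold: words annotated with this polarity in the reference data (assumed input).
    """
    predicted_set, gold_set = set(predicted), set(gold)
    true_positives = len(predicted_set & gold_set)
    # Precision: share of lexicon-tagged words that are correct.
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    # Recall: share of reference words that the lexicon recovered.
    recall = true_positives / len(gold_set) if gold_set else 0.0
    return precision, recall


# Toy example (hypothetical words, not taken from the evaluation dataset).
lexicon_positive = ["좋다", "훌륭하다", "빠르다", "비싸다"]
gold_positive = ["좋다", "훌륭하다", "빠르다", "편하다", "예쁘다"]
p, r = precision_recall(lexicon_positive, gold_positive)
print(f"precision={p:.3f}, recall={r:.3f}")
```

Running the same function separately on the positive and the negative word sets would yield the kind of per-polarity figures reported above.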