Evaluation and Extension of a Polarity Lexicon for German
We have manually curated a polarity lexicon for German that records the polarity and polarity strength of about 8,000 words: nouns, verbs and adjectives. The annotation decisions were made primarily on the basis of synsets from GermaNet, a WordNet-like lexical database. An evaluation on German novels showed that the stock of adjectives was too small. We therefore carried out experiments to automatically learn new subjective adjectives together with their polarity orientation and polarity strength, applying a corpus-based approach that works with pairs of coordinated adjectives extracted from a large German newspaper corpus. In the context of this work, we evaluated two subtasks in detail. First, we measured how well machine learning methods reproduce the polarity classification contained in our initial lexicon, including our three-level strength measure. Second, because adding training material did not improve the results at the expected rate, we measured human intercoder agreement on polarity classifications in an experiment. The results show that judgements about the strength of polarity vary considerably between persons. Given these problems in the design and automatic augmentation of polarity lexicons, we have successfully experimented with a semi-automatic approach in which a list of reliable candidate words (here: adjectives) is generated to ease the manual annotation process.
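To illustrate the idea of working with coordinated adjective pairs, the following is a minimal sketch and not the method used in the paper, whose features and learner are not described here. It assumes input that is already POS-tagged with the STTS tags `ADJA`, `ADJD` and `KON`, and it assumes (as a simplification) that "und" tends to link adjectives of the same polarity while "aber" tends to link adjectives of opposite polarity. The function names, the seed lexicon and the voting heuristic are all hypothetical.

```python
from collections import defaultdict

ADJ_TAGS = {"ADJA", "ADJD"}  # STTS tags for attributive and predicative adjectives

# Assumption for illustration: "und" preserves polarity, "aber" flips it.
CONJ_SIGN = {"und": 1, "aber": -1}


def coordinated_pairs(tagged):
    """Return (adj1, conjunction, adj2) triples from a list of (token, tag) pairs."""
    pairs = []
    # Slide a window of three tokens over the tagged sequence.
    for (w1, t1), (w2, t2), (w3, t3) in zip(tagged, tagged[1:], tagged[2:]):
        if (t1 in ADJ_TAGS and t2 == "KON" and t3 in ADJ_TAGS
                and w2.lower() in CONJ_SIGN):
            pairs.append((w1.lower(), w2.lower(), w3.lower()))
    return pairs


def vote_polarity(pairs, seed):
    """Sum polarity votes for adjectives not in `seed`, based on their partners.

    `seed` maps known adjectives to +1 (positive) or -1 (negative).
    A positive sum suggests a positive candidate, a negative sum a negative one.
    """
    votes = defaultdict(int)
    for a, conj, b in pairs:
        sign = CONJ_SIGN[conj]
        if a in seed and b not in seed:
            votes[b] += sign * seed[a]
        elif b in seed and a not in seed:
            votes[a] += sign * seed[b]
    return dict(votes)
```

Candidates scored this way would still need manual review, which matches the semi-automatic setting described above: the output is a ranked list of suggestions for annotators, not a finished lexicon.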