Research on the Construction of a Tibetan Sentiment Corpus

Sentiment classification is an active research topic in natural language processing. Compared with English and Chinese, sentiment analysis for Tibetan is difficult to study because relevant sentiment corpora are lacking. In this paper, we construct a Tibetan sentiment corpus by crawling Tibetan websites and by manually translating Chinese text into Tibetan. The resulting corpus largely meets the requirements for experimental use. It contains 10,134 emotion sentences, of which 2,025 were obtained by manual translation and 8,109 were crawled from the web.