Research on Chinese Spam Comment Detection Based on Chinese Characteristics

With the spread of the mobile Internet, people can express their views online anytime and anywhere. Personal attacks are a form of cyber violence that not only harms the people targeted but also degrades the online environment as a whole. To detect spam comments on the Chinese Internet, in this paper we carefully analyze the datasets and select four characteristics from real data. Drawing on knowledge of Chinese language and literature, we obtain a dictionary of malicious words with an expansion algorithm that starts from a very small number of seed words. Compared with manual dictionary selection, our method greatly reduces labor costs. In addition, combined with a deep learning model, it performs well on this natural language processing task. Finally, the experimental results show that the method effectively detects malicious comments on “NetEase” news.