Design and Implementation of a Chinese Spam Review Detection System

This paper designs and implements a rule-based detection system for Chinese spam reviews. The system applies three main types of rules: (1) computing the similarity between two comments; if the similarity exceeds a specified threshold, both comments are treated as review spam; (2) computing the degree of correlation between a comment and the product; if the degree falls below a specified threshold, the comment is treated as review spam; (3) detecting whether stuffing occurs in the keywords or meta fields of the web page; if it does, the comments are treated as review spam. In addition, we incorporate a Naive Bayes classifier into the detection system. We randomly selected 500 comments and manually labeled each one as true or false. Of these, 400 comments were used for training and the remaining 100 for testing, and the precision of the classifier was then measured on the test set. Experimental results show that the system performs satisfactorily.
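
As an illustration of rules (1) and (2), the following Python sketch shows how threshold tests of this kind could be written. It is not the implementation described in this paper: the abstract does not specify the similarity or correlation measures, so the sketch assumes Jaccard overlap of character bigrams (a common choice for Chinese text that avoids word segmentation). The function names and the threshold values 0.8 and 0.1 are hypothetical.

```python
from typing import Set


def char_bigrams(text: str) -> Set[str]:
    """Return the set of character bigrams of text, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return set(chars)
    return {chars[i] + chars[i + 1] for i in range(len(chars) - 1)}


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of character bigrams (assumed measure)."""
    sa, sb = char_bigrams(a), char_bigrams(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def relevance(comment: str, product_text: str) -> float:
    """Fraction of the comment's bigrams that also occur in the product text
    (assumed stand-in for the correlation degree)."""
    sc, sp = char_bigrams(comment), char_bigrams(product_text)
    if not sc:
        return 0.0
    return len(sc & sp) / len(sc)


def is_duplicate_pair(a: str, b: str, threshold: float = 0.8) -> bool:
    """Rule (1): flag both comments if similarity exceeds the threshold."""
    return similarity(a, b) > threshold


def is_off_topic(comment: str, product_text: str, threshold: float = 0.1) -> bool:
    """Rule (2): flag the comment if relevance falls below the threshold."""
    return relevance(comment, product_text) < threshold


if __name__ == "__main__":
    product = "手机 电池 耐用 屏幕"
    print(is_duplicate_pair("这个手机很好用", "这个手机很好用"))
    print(is_off_topic("加微信领取优惠券", product))
```

In practice the thresholds would have to be tuned on labeled data, such as the manually labeled comments mentioned above.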