Recognition of Internet Words Based on Conditional Random Fields

Unlike formal words, Internet words are used on the Internet and often carry emotional coloring. They are the basis of public opinion analysis. This paper presents an algorithm that recognizes Internet words using conditional random fields (CRFs) with an optimized feature set. Drawing on the characteristics of Internet words, the proposed algorithm also mines context information to construct rules that post-process the model's output. Evaluated on a real corpus, the algorithm achieves a precision of 89.87% and a recall of 84.73%. The results show that the algorithm is effective.
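
As an illustration of what a CRF feature set for this task might look like, the following is a minimal sketch and not the paper's actual feature templates. It assumes character-level tagging with B/I/O labels and a symmetric context window. The function names (`char_features`, `sentence_features`, `spans_from_bio`), the window size, and the boundary markers are all hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple


def char_features(sentence: str, i: int, window: int = 2) -> Dict[str, str]:
    """Context-window features for the character at position i.

    Hypothetical template: the current character, its neighbours within
    `window` positions, and the bigram formed with the previous character.
    """
    feats = {"bias": "1", "char": sentence[i]}
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        # Positions outside the sentence are replaced by boundary markers.
        if j < 0:
            value = "<BOS>"
        elif j >= len(sentence):
            value = "<EOS>"
        else:
            value = sentence[j]
        feats[f"char[{offset:+d}]"] = value
    prev = sentence[i - 1] if i > 0 else "<BOS>"
    feats["bigram[-1,0]"] = prev + sentence[i]
    return feats


def sentence_features(sentence: str, window: int = 2) -> List[Dict[str, str]]:
    """One feature dictionary per character of the sentence."""
    return [char_features(sentence, i, window) for i in range(len(sentence))]


def spans_from_bio(tags: List[str]) -> List[Tuple[int, int]]:
    """Convert B/I/O tags into (start, end) spans, end exclusive.

    This is the kind of decoded output that rule-based post-processing
    could then filter or merge (the rules themselves are not shown).
    """
    spans = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "I":
            if start is None:
                start = i  # tolerate an I tag with no preceding B
        else:
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans
```

A per-character list of feature dictionaries is a common input shape for CRF toolkits, so a sketch like this could feed a trainer of the reader's choice. The spans recovered from the predicted tags are where context-based rules, such as those the abstract mentions, would be applied.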