Method for Chinese short text classification based on feature extension

Because short texts carry only weak descriptive signals, this paper introduces a feature-extension method (STCFE) for classifying Chinese short texts. In this method, association rules between the feature items of the training set and the testing set are mined with the FP-Growth algorithm, and those rules are then applied to extend the features of the testing set. In addition, to classify Chinese short texts effectively, semantic information is introduced and the formula for the DEF terms of words in HowNet is improved. Experimental results show that the proposed method performs well, and that its Micro-F1 and Macro-F1 are higher than those of conventional approaches.
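
The rule-based extension step described above can be illustrated with a minimal sketch. It is not the paper's implementation: instead of FP-Growth it counts pairwise co-occurrences by brute force, it mines rules from a toy training set only, and it omits the HowNet semantic component. The function names (`mine_pair_rules`, `extend_features`), the thresholds, and the English toy data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(docs, min_support=2, min_confidence=0.6):
    """Mine one-to-one rules a -> b from co-occurrence counts.

    Brute-force stand-in for FP-Growth, restricted to item pairs.
    Thresholds are illustrative assumptions, not values from the paper.
    """
    item_count = Counter()
    pair_count = Counter()
    for doc in docs:
        items = sorted(set(doc))  # deduplicate terms within one text
        item_count.update(items)
        pair_count.update(combinations(items, 2))

    rules = {}
    for (a, b), n in pair_count.items():
        if n < min_support:
            continue
        # Test both directions: confidence(lhs -> rhs) = n / count(lhs)
        for lhs, rhs in ((a, b), (b, a)):
            if n / item_count[lhs] >= min_confidence:
                rules.setdefault(lhs, set()).add(rhs)
    return rules


def extend_features(doc, rules):
    """Append the consequents of every rule whose antecedent occurs in doc."""
    extended = list(doc)
    seen = set(doc)
    for term in doc:
        for new_term in sorted(rules.get(term, ())):
            if new_term not in seen:
                seen.add(new_term)
                extended.append(new_term)
    return extended


# Hypothetical, already-segmented toy texts
train = [
    ["stock", "market", "fund"],
    ["stock", "market", "bank"],
    ["stock", "fund"],
    ["football", "league", "goal"],
    ["football", "goal"],
]
rules = mine_pair_rules(train)
extended = extend_features(["stock", "rises"], rules)
print(extended)  # ['stock', 'rises', 'fund', 'market']
```

In this toy run the short test text gains the terms "fund" and "market" because both co-occur with "stock" often enough in the training texts. A full FP-Growth miner would additionally produce rules with multi-item antecedents, which this pairwise sketch cannot.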