An Efficient Character-Level and Word-Level Feature Fusion Method for Chinese Text Classification

To extract semantic features from text more efficiently and to reduce the effect of text representation on classification results, we propose C_BiGRU_ATT, a feature fusion model based on deep learning. The core task of the model is to extract the contextual and local information of the text with a Convolutional Neural Network (CNN) and an attention-based Bidirectional Gated Recurrent Unit (BiGRU) at both the character level and the word level. Experimental results show that C_BiGRU_ATT reaches classification accuracies of 95.55% and 95.60% on two Chinese datasets, THUCNews and WangYi, respectively. Compared with single CNN models at the character level and the word level, the classification accuracy of C_BiGRU_ATT is higher by 1.6% and 2.7% on THUCNews, and by 0.6% and 5.2% on WangYi, respectively. These results indicate that C_BiGRU_ATT extracts text features more effectively.
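
The fusion idea described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not specify the attention form or the fusion operator, so dot-product attention against a query vector and fusion by concatenation are assumptions made here. All function names (`attention_pool`, `max_over_time`, `fuse`) and all numeric values are hypothetical. The toy lists stand in for BiGRU hidden states and CNN feature maps that a real model would compute.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Pool a sequence of hidden states into one vector.

    Assumption: weights are the softmax of dot products between each
    hidden state and a query vector (learned in a real model).
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return pooled, weights


def max_over_time(feature_maps):
    """Max pooling over sequence positions, one value per CNN filter."""
    return [max(channel) for channel in zip(*feature_maps)]


def fuse(*vectors):
    """Assumption: fusion is plain concatenation of the feature vectors."""
    fused = []
    for v in vectors:
        fused.extend(v)
    return fused


# Toy stand-ins for the outputs of the two branches at each granularity.
char_bigru_states = [[0.2, 0.1], [0.7, 0.3], [0.1, 0.9]]  # 3 time steps, dim 2
word_bigru_states = [[0.5, 0.4], [0.3, 0.8]]              # 2 time steps, dim 2
char_cnn_maps = [[0.1, 0.6], [0.4, 0.2], [0.9, 0.3]]      # 3 positions, 2 filters
word_cnn_maps = [[0.7, 0.1], [0.2, 0.5]]                  # 2 positions, 2 filters
query = [1.0, 1.0]                                        # hypothetical query vector

char_context, char_weights = attention_pool(char_bigru_states, query)
word_context, word_weights = attention_pool(word_bigru_states, query)
char_local = max_over_time(char_cnn_maps)
word_local = max_over_time(word_cnn_maps)

# The fused vector would feed a classifier layer in a full model.
fused_features = fuse(char_context, char_local, word_context, word_local)
```

In this sketch each branch contributes a fixed-size vector regardless of sequence length, which is what makes concatenating character-level and word-level features straightforward even though the two sequences have different lengths.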