Classification of e-government documents based on cooperative expression of word vectors

Effective document classification is a powerful technique for handling the huge volume of e-government documents automatically rather than processing them manually. The word-to-vector (word2vec) model, which converts words into low-dimensional semantic vectors, can be employed to classify e-government documents. In this paper, we propose the cooperative expression of word vectors (Co-word-vector), which integrates word representations at multiple granularities to model documents in the semantic space. To this end, we improve the weighted continuous bag-of-words model based on word2vec and the distributed representation of topic words based on the LDA model. Experimental results show that, by combining these two levels of word representation, the proposed method outperforms the traditional method on e-government document classification.
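
As a rough illustration of combining two levels of word representation, the following minimal Python sketch builds a document vector by concatenating a weighted average of word-level vectors with a weighted average of topic-word vectors. The abstract does not specify the weighting scheme or the combination rule, so the concatenation, the function names (`weighted_average`, `co_word_vector`), and all toy vectors and weights below are assumptions made for illustration only, not the method evaluated in the paper.

```python
from collections import Counter


def weighted_average(tokens, embeddings, weights):
    """Weighted mean of the vectors of the tokens that have an embedding.

    Tokens without an embedding are skipped; tokens without an explicit
    weight default to 1.0. Both choices are assumptions of this sketch.
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for token, count in Counter(tokens).items():
        if token not in embeddings:
            continue
        w = weights.get(token, 1.0) * count
        for i, value in enumerate(embeddings[token]):
            total[i] += w * value
        weight_sum += w
    if weight_sum == 0.0:
        return total  # all-zero vector when no token is covered
    return [x / weight_sum for x in total]


def co_word_vector(tokens, word_emb, word_w, topic_emb, topic_w):
    """Concatenate the word-level and topic-level document vectors.

    Concatenation is a hypothetical stand-in for the paper's integration step.
    """
    word_part = weighted_average(tokens, word_emb, word_w)
    topic_part = weighted_average(tokens, topic_emb, topic_w)
    return word_part + topic_part


# Toy data (hypothetical): 2-d word vectors and 3-d topic-word vectors.
word_emb = {"tax": [1.0, 0.0], "permit": [0.0, 1.0]}
word_w = {"tax": 2.0, "permit": 1.0}      # e.g. TF-IDF-style weights
topic_emb = {"tax": [0.5, 0.5, 0.0]}      # e.g. derived from LDA topics
topic_w = {"tax": 1.0}

doc = ["tax", "permit", "tax", "form"]
vector = co_word_vector(doc, word_emb, word_w, topic_emb, topic_w)
print(vector)
```

The resulting vector could then be passed to any standard classifier; the choice of classifier is likewise left open here.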