Two improved continuous bag-of-words models

Data representation is a fundamental task in machine learning, and it affects the performance of the whole learning system. In recent years, with the rapid development of deep learning, neural-network-based word embedding models have brought new inspiration to natural language processing research. In this paper, two schemes for improving the Continuous Bag-of-Words (CBOW) model are proposed. On one hand, the relative positions of adjacent words are taken as weights for the input layer of the model; on the other hand, the surrounding context is incorporated into training when the next target word is predicted. Experimental results show that both proposed models outperform the classical CBOW model.
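
To make the first scheme concrete, the sketch below shows one plausible realization of a position-weighted CBOW input layer: each context slot is weighted by the inverse of its distance from the target word, so nearer words contribute more to the averaged context vector. The abstract does not specify the exact weighting function, so the inverse-distance choice, the class name PositionWeightedCBOW, and all hyperparameters here are illustrative assumptions, not the paper's definitive implementation.

```python
import torch
import torch.nn as nn

class PositionWeightedCBOW(nn.Module):
    """CBOW variant whose input layer weights each context word by its
    relative position (a sketch; the paper's exact scheme may differ)."""

    def __init__(self, vocab_size, embed_dim, window_size):
        super().__init__()
        self.embeddings = nn.Embedding(vocab_size, embed_dim)
        # Offsets -window..-1, 1..window; assumed weight = 1 / |offset|,
        # normalized to sum to 1, so closer words count more.
        offsets = [o for o in range(-window_size, window_size + 1) if o != 0]
        w = torch.tensor([1.0 / abs(o) for o in offsets])
        self.register_buffer("pos_weights", w / w.sum())
        self.output = nn.Linear(embed_dim, vocab_size)

    def forward(self, context_ids):
        # context_ids: (batch, 2 * window_size), ordered by offset.
        vecs = self.embeddings(context_ids)                    # (B, S, D)
        hidden = (vecs * self.pos_weights.unsqueeze(-1)).sum(dim=1)
        return self.output(hidden)                             # vocab logits

# Example: predict a target word from a window of 2 words on each side.
model = PositionWeightedCBOW(vocab_size=10000, embed_dim=100, window_size=2)
context = torch.randint(0, 10000, (8, 4))   # batch of 8 context windows
logits = model(context)                      # (8, 10000)
```

Replacing the classical CBOW's uniform average with these position-dependent weights changes only the input-layer aggregation, so the model can still be trained with the usual cross-entropy (or negative-sampling) objective over the target word.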