Building Language Models with Fuzzy Weights

Word2Vec is a recently developed tool for building neural network language models. This work proposes an improvement to Word2Vec that adds fuzzy weights based on the distances between words in the context, so that the model uses more information than the original linear bag-of-words structure does. Word2Vec assigns the same weights regardless of the distances between words. We argue that word distances in the context carry semantic information that can be exploited to reinforce the connections of the network model more effectively. To formalize the influence of different distances in the context, we use Gaussian functions to represent the fuzzy weights, which take part in training the network connections. Various experiments show that the proposed improvement yields better language models than Word2Vec.
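
To make the idea of distance-dependent weights concrete, the following is a minimal sketch and not the implementation evaluated in this work. It assumes a zero-mean Gaussian of the word distance with a hypothetical width parameter sigma, and it assumes the weights are normalized to sum to one before the context vectors are combined. The function names gaussian_weight and weighted_context are illustrative only.

```python
import math


def gaussian_weight(distance, sigma=2.0):
    """Fuzzy weight for a context word `distance` positions away.

    Assumption: a zero-mean Gaussian with width `sigma`; the abstract
    does not specify the exact parameterization.
    """
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


def weighted_context(vectors, distances, sigma=2.0):
    """Combine context word vectors using normalized Gaussian weights.

    A plain bag-of-words average would give every vector the same
    weight; here closer words contribute more.
    """
    weights = [gaussian_weight(d, sigma) for d in distances]
    total = sum(weights)
    dim = len(vectors[0])
    return [
        sum(w * v[i] for w, v in zip(weights, vectors)) / total
        for i in range(dim)
    ]


# Toy context of four 2-dimensional word vectors around a target word.
context = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
distances = [2, 1, 1, 2]  # positions away from the target word
combined = weighted_context(context, distances)
```

In this toy example the two adjacent words point along the second axis, so the combined vector leans toward that axis, whereas a uniform average would weight both axes equally.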