HWE: Hybrid Word Embeddings For Text Classification

Text classification is one of the most important tasks in natural language processing and information retrieval, owing to the increasing availability of documents in digital form and the ensuing need to access them in flexible ways. By assigning documents to labeled classes, text classification can reduce the search space and expedite the retrieval of relevant documents. In this paper, we propose a novel text representation method, Hybrid Word Embeddings (HWE), which combines semantic information obtained from WordNet with contextual information extracted from text documents to provide concise and accurate representations of text documents. By taking advantage of the semantic relationships extracted from WordNet, the proposed HWE method can derive word semantics from text more efficiently and with a smaller training corpus. An experimental study on document classification shows that the proposed HWE outperforms existing methods, including Doc2Vec and Word2Vec, in terms of classification accuracy, recall, and precision, among other metrics.