A model of Korean sentence similarity measurement using sense-based morpheme embedding and RNN sentence encoding

Sentence similarity measurement is an important technology that can be applied to various applications in natural language processing. Recently, encoder-decoder models based on recurrent neural networks (RNNs) have achieved remarkable results. This paper proposes a model for measuring Korean sentence similarity based on sense-based morpheme embedding and a gated recurrent unit (GRU) encoder. We evaluate a measurement model consisting of experimentally optimized morpheme embedding and sentence encoding models. In measuring sentence similarity, the proposed model, which encodes sentences using pre-trained morpheme embeddings, outperforms the character-embedding model. In addition, it can be used effectively in question answering (Q&A) systems.
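
As a rough illustration of the pipeline summarized above, the following minimal sketch embeds sense-tagged morphemes, encodes each sentence with a GRU, and compares the resulting sentence vectors. It is not the implementation evaluated in this paper: the class name GRUEncoder, the token format (for example "배_02/NNG" for one sense of a homograph), the randomly initialized weights standing in for pre-trained embeddings, the omission of bias terms, and the use of cosine similarity as the comparison function are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vec):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class GRUEncoder:
    """Hypothetical GRU sentence encoder over sense-tagged morphemes."""

    def __init__(self, vocab, embed_dim=8, hidden_dim=6, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # Random vectors stand in for pre-trained morpheme embeddings (assumption).
        self.embed = {
            tok: [rng.uniform(-0.5, 0.5) for _ in range(embed_dim)]
            for tok in sorted(vocab)
        }
        self.unk = [0.0] * embed_dim  # unknown morphemes map to a zero vector
        # Input (W) and recurrent (U) weights for the update (z), reset (r),
        # and candidate (n) computations; biases are omitted for brevity.
        self.W = {g: rand_matrix(hidden_dim, embed_dim, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hidden_dim, hidden_dim, rng) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in
             zip(matvec(self.W["n"], x), matvec(self.U["n"], reset_h))]
        # Interpolate between the previous state and the candidate state.
        return [(1.0 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]

    def encode(self, morphemes):
        # The final hidden state serves as the sentence vector.
        h = [0.0] * self.hidden_dim
        for tok in morphemes:
            h = self.step(self.embed.get(tok, self.unk), h)
        return h


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical sense-tagged morpheme sequences; "배_02" and "배_03" stand for
# two senses of the same surface form and therefore get separate embeddings.
s1 = ["나/NP", "는/JX", "배_02/NNG", "를/JKO", "먹/VV", "었/EP", "다/EF"]
s2 = ["나/NP", "는/JX", "배_03/NNG", "를/JKO", "타/VV", "았/EP", "다/EF"]

encoder = GRUEncoder(vocab=set(s1) | set(s2))
similarity = cosine(encoder.encode(s1), encoder.encode(s2))
print(f"similarity = {similarity:.3f}")
```

Because the weights here are random, the printed score carries no linguistic meaning. The sketch only shows how distinct sense tags lead to distinct input vectors and how two encoded sentences can be compared once trained embeddings and encoder weights are available.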