Transfer Learning Method for Very Deep CNN for Text Classification and Methods for its Evaluation

In recent years, convolutional neural networks (CNNs) have made it possible to perform text classification with high accuracy. Zhang et al. decomposed words into characters and classified texts with a relatively deep CNN, obtaining excellent classification results. However, in real-world text-classification problems it is often difficult to prepare a sufficient number of labeled samples. One method for handling this problem is transfer learning, which uses a network trained on another task as the initial network for the target task. While transfer learning is known to be effective for image recognition, it has not yet been shown for what types of data, and to what extent, it is effective for natural language processing tasks such as document classification. In this paper, we first introduce a character-level CNN that adopts the structure of a residual network, which lets us construct a deeper network for Japanese text classification. We then demonstrate that performing transfer learning between two particular datasets improves classification accuracy. Finally, we propose an approach for evaluating the effectiveness of transfer learning and use it to evaluate our model.
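
As a rough illustration of the transfer-learning step described above, the following minimal sketch initializes a target-task network from a network trained on a source task. It is not the implementation used in this paper: the function name `init_from_source`, the layer names (`conv1`, `res_block1`, `fc_out`), the list-of-rows weight layout, and the choice to re-initialize only the output layer are all assumptions made for illustration.

```python
import random


def init_from_source(source_params, num_target_classes,
                     output_layer="fc_out", seed=0):
    """Build initial target-task parameters from a trained source network.

    Assumption: every layer except the output layer is copied, and the
    output layer is re-initialized because the target task may have a
    different number of classes.
    """
    rng = random.Random(seed)
    target_params = {}
    for name, weights in source_params.items():
        if name == output_layer:
            continue
        # Copy each row so that fine-tuning the target network
        # does not mutate the source network's weights.
        target_params[name] = [row[:] for row in weights]

    # New output layer: one row of small random weights per target class.
    in_features = len(source_params[output_layer][0])
    target_params[output_layer] = [
        [rng.uniform(-0.01, 0.01) for _ in range(in_features)]
        for _ in range(num_target_classes)
    ]
    return target_params


# Toy source network (hypothetical layer names and values).
source = {
    "conv1": [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1]],
    "res_block1": [[0.2, 0.2], [-0.3, 0.4]],
    "fc_out": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]],  # 4 source classes
}

# Target task with 3 classes: hidden layers are reused, output layer is new.
target = init_from_source(source, num_target_classes=3)
```

The target network would then be fine-tuned on the labeled samples of the target task; whether to update or freeze the copied layers is a separate design choice that this sketch leaves open.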