Deep learning model for unstructured knowledge classification using structural features

Automatic text classification is widely used as a basic method for analyzing data. Classical methods such as the support vector machine (SVM) have performed well in this area, and the recent use of deep learning has led to considerable further progress. This study proposes a deep learning-based classification model called DEEP-I to classify national research and development information, which has complex structural features, a large amount of text, and a large number of classes. Beyond the word-sentence structure of a simple document, the model increases the number of stacked layers to reflect the higher-level structure of the items in each document. Experiments on 180,000 datasets and 366 classification schemes showed that the proposed model improves classification performance by 22.7% over the traditional SVM and by 15.7% over a deep learning model that uses only the word-sentence structural features. The improvement is attributed to the multi-layered stacking method, which enhances learning by making the model five to ten times deeper than the conventional deep learning model and by effectively combining the features of heterogeneous items. The proposed model is also applicable to datasets containing documents with complex structures.
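The idea of stacking one encoding level per structural level (word, sentence, item) can be illustrated with a minimal sketch. This is not the DEEP-I architecture: the abstract does not specify layer types, sizes, or how items are combined, so the tanh layers, mean pooling, random weights, and every name below (`dense`, `mean_pool`, `encode_document`, `classify`, `DIM`, `N_CLASSES`) are assumptions chosen only to show the nesting.

```python
import math
import random

DIM = 4        # hypothetical embedding width
N_CLASSES = 3  # hypothetical class count (the study reports 366 schemes)


def dense(vec, weights):
    """Apply a tanh-activated linear layer: one output per weight row."""
    return [math.tanh(sum(w * x for w, x in zip(row, vec))) for row in weights]


def mean_pool(vectors):
    """Average a non-empty list of equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def init_layer(rng, rows, cols):
    """Create a rows x cols weight matrix with small random values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def encode_document(doc, layers):
    """Encode a document given as items -> sentences -> word vectors.

    One layer is applied per structural level, so the stack grows with
    the depth of the document structure (an assumed simplification).
    """
    word_layer, sentence_layer, item_layer = layers
    item_vecs = []
    for item in doc:
        sent_vecs = []
        for sentence in item:
            words = [dense(w, word_layer) for w in sentence]
            sent_vecs.append(dense(mean_pool(words), sentence_layer))
        item_vecs.append(dense(mean_pool(sent_vecs), item_layer))
    # Combine the heterogeneous items into one document vector.
    return mean_pool(item_vecs)


def classify(doc_vec, class_weights):
    """Return the index of the highest-scoring class."""
    scores = [sum(w * x for w, x in zip(row, doc_vec)) for row in class_weights]
    return max(range(len(scores)), key=scores.__getitem__)


rng = random.Random(0)
layers = [init_layer(rng, DIM, DIM) for _ in range(3)]
class_weights = init_layer(rng, N_CLASSES, DIM)

# Toy document: 2 items, each with 2 sentences of 3 random word vectors.
doc = [[[[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(3)]
        for _ in range(2)] for _ in range(2)]

doc_vec = encode_document(doc, layers)
label = classify(doc_vec, class_weights)
```

With untrained random weights the predicted label carries no meaning; the sketch only shows how an extra item level adds layers on top of a word-sentence model.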
