A multi-label text classification method via dynamic semantic representation model and deep neural network

The growth of new words and text categories calls for more accurate and robust classification methods. In this paper, we propose a novel multi-label text classification method that combines a dynamic semantic representation model with a deep neural network (DSRM-DNN). DSRM-DNN first uses a word embedding model and a clustering algorithm to select semantic words. The selected words are then designated as the elements of DSRM-DNN and quantified by a weighted combination of word attributes. Finally, we construct a text classifier by combining a deep belief network with a back-propagation neural network. During classification, low-frequency words and new words are re-expressed in terms of the existing semantic words under a sparse constraint. We evaluate DSRM-DNN on RCV1-v2, Reuters-21578, EUR-Lex, and Bookmarks. Experimental results show that our method outperforms state-of-the-art methods.
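As a rough illustration of the re-expression step, the sketch below writes a new word's embedding as a sparse combination of semantic word embeddings. The abstract does not specify the form of the sparse constraint or the solver, so the L1 penalty, the ISTA iteration, and the names `sparse_reexpress`, `lam`, and `n_iter` are assumptions made for this example only, not the paper's formulation.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of the L1 norm: shrink each entry toward zero by t.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_reexpress(D, v, lam=0.1, n_iter=500):
    """Approximately solve min_a 0.5*||D a - v||^2 + lam*||a||_1 with ISTA.

    D: (dim, k) array whose columns are semantic word embeddings (assumed).
    v: (dim,) embedding of a new or low-frequency word (assumed).
    Returns the (k,) coefficient vector a.
    """
    # Step size 1/L, where L is the Lipschitz constant of the smooth term's gradient.
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - v)
        a = soft_threshold(a - grad / L, lam / L)
    return a

# Toy example: three orthonormal "semantic words" in a 4-dimensional space.
D = np.eye(4)[:, :3]
v = np.array([2.0, 0.0, 0.0, 0.0])  # new word aligned with the first semantic word
coeffs = sparse_reexpress(D, v)
```

In this toy case only the first coefficient is nonzero, so the new word is represented by a single semantic word. A larger `lam` gives sparser coefficients at the cost of a looser reconstruction.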
