ALBERT over Match-LSTM Network for Intelligent Questions Classification in Chinese

This paper presents a series of experiments with an ALBERT over match-LSTM network built on top of pre-trained word vectors, aimed at accurate classification of questions in intelligent question answering and thus at reliable information service. To improve classification performance, a short-text classification method based on an ALBERT and match-LSTM model is proposed to overcome common limitations of the task, such as limited vocabulary, sparse features, large data volume, heavy noise, and poor normalization. In the model, the Jieba word segmentation tool with an agricultural dictionary is used for text segmentation; the GloVe algorithm is then adopted to enrich the text features and to weight the word vectors according to the key vectors of the text; a bi-directional gated recurrent unit (BiGRU) captures contextual feature information; and multiple parallel convolutional neural networks extract local multidimensional features of the text. Batch normalization, dropout, global average pooling, and global max pooling are employed to mitigate overfitting. The results show that the model classifies questions accurately, achieving a precision of 96.8%. Compared with other classification models, such as multi-SVM and CNN, ALBERT+match-LSTM shows clear advantages in classification performance for intelligent Agri-tech information services.
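As a rough illustration of the pipeline described above, the following is a minimal sketch in PyTorch of the segmentation and BiGRU + multi-CNN + pooling portion. It is a sketch under stated assumptions, not the paper's implementation: the dictionary path, sample question, and all hyperparameters (embedding size, GRU units, kernel sizes, class count) are illustrative; the embedding layer would in practice be initialized from pre-trained GloVe vectors; and the ALBERT and match-LSTM components of the full model are not shown.

```python
# Hypothetical sketch of the described pipeline (jieba + PyTorch assumed).
import jieba
import torch
import torch.nn as nn

# Step 1: word segmentation with a domain dictionary (path is hypothetical).
jieba.load_userdict("agri_dict.txt")
tokens = list(jieba.cut("温室番茄叶片发黄是什么原因?"))  # "Why are greenhouse tomato leaves yellowing?"

class QuestionClassifier(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, hidden=128,
                 kernel_sizes=(2, 3, 4), n_filters=64, n_classes=10):
        super().__init__()
        # Step 2: embedding layer; the paper would initialize this from
        # pre-trained GloVe vectors rather than train it from scratch.
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # Step 3: BiGRU captures contextual features in both directions.
        self.bigru = nn.GRU(emb_dim, hidden, batch_first=True,
                            bidirectional=True)
        # Step 4: parallel 1-D convolutions extract local n-gram features.
        self.convs = nn.ModuleList(
            [nn.Conv1d(2 * hidden, n_filters, k) for k in kernel_sizes])
        self.bn = nn.BatchNorm1d(n_filters * len(kernel_sizes) * 2)
        self.dropout = nn.Dropout(0.5)
        self.fc = nn.Linear(n_filters * len(kernel_sizes) * 2, n_classes)

    def forward(self, x):                      # x: (batch, seq_len) token ids
        h, _ = self.bigru(self.embedding(x))   # (batch, seq_len, 2*hidden)
        h = h.transpose(1, 2)                  # Conv1d expects (batch, C, L)
        feats = []
        for conv in self.convs:
            c = torch.relu(conv(h))            # (batch, n_filters, L')
            # Step 5: concatenate global average and global max pooling.
            feats.append(c.mean(dim=2))
            feats.append(c.max(dim=2).values)
        # Batch norm + dropout to curb overfitting, then classify.
        out = self.dropout(self.bn(torch.cat(feats, dim=1)))
        return self.fc(out)

model = QuestionClassifier(vocab_size=50000)
logits = model(torch.randint(0, 50000, (4, 20)))  # 4 dummy questions, length 20
print(logits.shape)  # torch.Size([4, 10])
```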
