Sequence Generation Model Integrating Domain Ontology for Mathematical Question Tagging

In online learning systems, tagging questions with knowledge points is a fundamental task. Automatic tagging uses learning algorithms to assign knowledge points to questions, reducing manual effort and time costs. However, current knowledge point tagging methods have three shortcomings: they do not handle the fact that a mathematics question often involves a variable number of knowledge points, they do not account for the characteristics of the mathematics domain, and they ignore the internal connections between knowledge points. To address these issues, we propose a Sequence Generation Model Integrating Domain Ontology for Mathematical Question Tagging (SOMPT). SOMPT first performs data augmentation on the question text and then applies domain-ontology-based replacement to obtain an intermediate text that is easier for a deep learning model to understand. It also uses dynamic word vector embeddings to improve the textual representation of math questions. In addition, by generating knowledge points as a sequence, the model captures the relationships between tags and predicts knowledge points more accurately. Comparative experiments show that the proposed model tags mathematical questions effectively. Moreover, the sequence generation module in SOMPT can be applied to other multi-label classification tasks, where it performs on par with state-of-the-art models.
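To make the sequence-generation idea concrete, the following is a minimal sketch of how a decoder can emit a variable number of tags: it picks one tag per step, masks tags already emitted, and stops when an end-of-sequence symbol wins. This is not SOMPT's implementation. The names `greedy_decode_tags`, `toy_scorer`, and `EOS`, as well as all scores, are hypothetical, and `toy_scorer` merely stands in for a trained decoder whose scores depend on the tags generated so far.

```python
from typing import Callable, Dict, List

EOS = "<eos>"  # hypothetical end-of-sequence symbol


def greedy_decode_tags(
    score_next: Callable[[List[str]], Dict[str, float]],
    max_tags: int = 10,
) -> List[str]:
    """Emit tags one at a time until EOS scores highest, so the tag count is not fixed."""
    emitted: List[str] = []
    for _ in range(max_tags):
        scores = score_next(emitted)
        # Mask tags that were already emitted so that no tag is repeated.
        candidates = {t: s for t, s in scores.items() if t not in emitted}
        if not candidates:
            break
        best = max(candidates, key=candidates.get)
        if best == EOS:
            break
        emitted.append(best)
    return emitted


def toy_scorer(prefix: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a trained decoder (all scores are made up).

    The score of a tag depends on the tags generated so far, which is how
    a sequence model can exploit relationships between tags.
    """
    scores = {
        "triangle": 0.6,
        "pythagorean theorem": 0.1,
        "probability": 0.05,
        EOS: 0.3,
    }
    if "triangle" in prefix:
        scores["pythagorean theorem"] = 0.7  # related tag becomes more likely
    if "pythagorean theorem" in prefix:
        scores[EOS] = 0.9  # nothing more to add, so stop
    return scores


if __name__ == "__main__":
    print(greedy_decode_tags(toy_scorer))
```

In this toy setting the second tag only becomes likely once the first has been emitted, which illustrates the kind of tag dependency that independent per-label classifiers do not model.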
