QuesNet: A Unified Representation for Heterogeneous Test Questions

Understanding learning materials (e.g., test questions) is a crucial issue in online learning systems and can promote many applications in the education domain. Unfortunately, many supervised approaches suffer from scarce human-labeled data, while abundant unlabeled resources remain highly underutilized. An effective way to alleviate this problem is to use pre-trained representations for question understanding. However, existing pre-training methods from the NLP area are infeasible for learning test question representations due to several domain-specific characteristics of education. First, questions usually comprise heterogeneous data, including content text, images, and side information. Second, they carry both basic linguistic information and domain-specific logic and knowledge. To this end, in this paper we propose a novel pre-training method, QuesNet, for comprehensively learning question representations. Specifically, we first design a unified framework that aggregates a question's heterogeneous inputs into a comprehensive vector. Then we propose a two-level hierarchical pre-training algorithm that learns a better understanding of test questions in an unsupervised way: a novel holed language model objective is developed to extract low-level linguistic features, and a domain-oriented objective is proposed to learn high-level logic and knowledge. Moreover, we show that QuesNet can be readily fine-tuned for many question-based tasks. We conduct extensive experiments on large-scale real-world question data, and the results clearly demonstrate the effectiveness of QuesNet for question understanding as well as its superior applicability.
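
The holed language model objective can be made concrete with a short sketch. The code below is our own illustration under stated assumptions, not the paper's implementation: it uses PyTorch, and the class name `HoledLanguageModel` and all hyperparameters are hypothetical. The idea is that the predictor for the token at position i sees the forward recurrent state at i-1 and the backward recurrent state at i+1, so each token is predicted from its full surrounding context without ever seeing itself.

```python
import torch
import torch.nn as nn

class HoledLanguageModel(nn.Module):
    """Illustrative sketch of a 'holed' LM objective: predict each token
    from both sides of its context while excluding the token itself."""

    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Separate forward and backward LSTMs, so the state used to
        # predict position i never encodes token i itself.
        self.fwd = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.bwd = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(2 * hidden_dim, vocab_size)

    def forward(self, tokens):  # tokens: (batch, seq_len) of token ids
        emb = self.embed(tokens)
        h_fwd, _ = self.fwd(emb)              # left-to-right states
        h_bwd, _ = self.bwd(emb.flip(1))      # right-to-left states
        h_bwd = h_bwd.flip(1)
        # Context for position i: forward state at i-1, backward at i+1;
        # zero-pad at the sequence boundaries.
        pad = h_fwd.new_zeros(h_fwd.size(0), 1, h_fwd.size(2))
        left = torch.cat([pad, h_fwd[:, :-1]], dim=1)
        right = torch.cat([h_bwd[:, 1:], pad], dim=1)
        logits = self.out(torch.cat([left, right], dim=-1))
        # Cross-entropy against the very tokens that were "held out".
        return nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)), tokens.reshape(-1))
```

Compared with a left-to-right language model, this objective conditions every prediction on the full bidirectional context; compared with masked-LM pre-training, no input tokens need to be replaced or corrupted.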
