KnowEdu: A System to Construct Knowledge Graph for Education

Motivated by the wide range of applications of knowledge graphs and the increasing demand in the education domain, we propose a system, called KnowEdu, to automatically construct knowledge graphs for education. By leveraging heterogeneous data (e.g., pedagogical data and learning assessment data) from the education domain, the system first extracts the concepts of subjects or courses and then identifies the educational relations between those concepts. More specifically, it applies a neural sequence labeling algorithm to pedagogical data to extract instructional concepts, and it employs probabilistic association rule mining on learning assessment data to identify relations with educational significance. We detail these steps through an exemplary case of constructing a demonstrative knowledge graph for mathematics, in which the instructional concepts and their prerequisite relations are derived from curriculum standards and from concept-based performance data of students. Evaluation results show that the F1 score for concept extraction exceeds 0.70 and that, for relation identification, the area under the curve and the mean average precision reach 0.95 and 0.87, respectively.
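As a rough illustration of how association rules over uncertain mastery data could flag a prerequisite relation, the sketch below scores a candidate pair of concepts by expected-support confidence. It is not the KnowEdu implementation: the function names, the toy mastery probabilities, the 0.8 threshold, and the simplification that a student's mastery of two concepts is independent are all assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical input: for each student, the estimated probability of
# having mastered each concept (all values here are made up).
students: List[Dict[str, float]] = [
    {"addition": 0.90, "multiplication": 0.80},
    {"addition": 0.90, "multiplication": 0.10},
    {"addition": 0.10, "multiplication": 0.05},
    {"addition": 0.95, "multiplication": 0.90},
]


def rule_confidence(data: List[Dict[str, float]], antecedent: str,
                    consequent: str, negate: bool = False) -> float:
    """Expected-support confidence of the rule antecedent => consequent.

    With negate=True the rule scored is (not antecedent) => (not consequent).
    Assumption: within one student, mastery of the two concepts is independent,
    so the joint probability is the product of the two marginals.
    """
    joint = 0.0  # expected number of students satisfying both sides
    base = 0.0   # expected number of students satisfying the antecedent
    for record in data:
        p_a, p_c = record[antecedent], record[consequent]
        if negate:
            p_a, p_c = 1.0 - p_a, 1.0 - p_c
        joint += p_a * p_c
        base += p_a
    return joint / base if base > 0 else 0.0


def is_prerequisite(data: List[Dict[str, float]], pre: str, post: str,
                    min_conf: float = 0.8) -> bool:
    """Flag `pre` as a prerequisite of `post` when both rules are confident:
    mastering `post` implies mastering `pre`, and
    not mastering `pre` implies not mastering `post`.
    """
    forward = rule_confidence(data, post, pre)
    backward = rule_confidence(data, pre, post, negate=True)
    return forward >= min_conf and backward >= min_conf


if __name__ == "__main__":
    print(is_prerequisite(students, "addition", "multiplication"))
    print(is_prerequisite(students, "multiplication", "addition"))
```

Requiring both rules makes the test directional: on the toy data, "addition" is flagged as a prerequisite of "multiplication" but not the other way round. A full probabilistic treatment would reason about the distribution of support and confidence rather than only their expected values.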
