Concept Hierarchy Extraction from Textbooks

Concept hierarchies are useful tools for presenting and organizing knowledge. With the rapid growth in the number of online knowledge resources, automatic concept hierarchy extraction is increasingly attractive. Here, we focus on extracting concepts from textbooks using the knowledge in Wikipedia. Given a book, we extract the important concepts in each chapter, using Wikipedia as a resource, and from these construct a concept hierarchy for the book. We define local and global features that capture both the local relatedness and the global coherence embedded in the textbook. To evaluate the proposed features and the extracted concept hierarchies, we manually construct concept hierarchies for three widely used textbooks by labeling the important concepts in each chapter. Experiments show that our proposed local and global features achieve better performance than using only keyphrases to construct the concept hierarchies. Moreover, we observe that incorporating global features can improve the concept ranking precision and reaffirm the global coherence in the book.
