Towards the Prediction of Semantic Complexity Based on Concept Graphs

The evaluation of text complexity is an important topic in education. While lexical and syntactic analysis have been used for this purpose for decades, the assessment of semantic complexity is less common, and the recent works that tackle it rely on machine learning algorithms that are hard to explain and are not specifically designed to measure this variable. To address this issue, we explore in this paper the engineering of novel features to evaluate conceptual complexity. By constructing a knowledge graph that captures the concepts present in a text and their generalized forms, we compute several graph-based metrics that express this complexity. Finally, early-stage evaluations based on a well-known public corpus of student writing show that the use of these metrics significantly improves performance compared to a state-of-the-art binary neural network classifier.
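
As a rough illustration of the general idea only, the sketch below builds a small concept graph from concepts mentioned in a text plus their more general forms, then computes a few generic graph statistics. Everything in it is an assumption made for illustration: the toy `mentions` list, the hand-written `broader` map, the helper names, and the choice of metrics (node count, edge count, density, connected components) are hypothetical and are not the features or pipeline used in the paper.

```python
from collections import defaultdict

# Hypothetical toy input: concepts spotted in a text, and a hand-written
# map from each concept to more general concepts (not data from the paper).
mentions = ["Dog", "Cat", "Paris"]
broader = {
    "Dog": ["Mammal"],
    "Cat": ["Mammal"],
    "Mammal": ["Animal"],
    "Paris": ["City"],
    "City": ["Place"],
}


def build_concept_graph(mentions, broader):
    """Return an undirected adjacency map linking concepts to their generalizations."""
    graph = defaultdict(set)
    stack = list(mentions)
    seen = set()
    while stack:
        concept = stack.pop()
        if concept in seen:
            continue
        seen.add(concept)
        graph[concept]  # register the node even if it has no generalization
        for parent in broader.get(concept, []):
            graph[concept].add(parent)
            graph[parent].add(concept)
            stack.append(parent)
    return dict(graph)


def graph_metrics(graph):
    """Compute a few generic statistics (illustrative choices, not the paper's)."""
    n = len(graph)
    m = sum(len(neighbours) for neighbours in graph.values()) // 2
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0

    # Count connected components with an iterative depth-first search.
    unvisited = set(graph)
    components = 0
    while unvisited:
        components += 1
        stack = [unvisited.pop()]
        while stack:
            node = stack.pop()
            for neighbour in graph[node]:
                if neighbour in unvisited:
                    unvisited.remove(neighbour)
                    stack.append(neighbour)

    return {"nodes": n, "edges": m, "density": density, "components": components}


graph = build_concept_graph(mentions, broader)
metrics = graph_metrics(graph)
print(metrics)
```

In this toy setting, a text whose concepts share few generalizations yields more components and a sparser graph, which is one plausible way such statistics could serve as explainable features for a classifier.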
