Syntactic, Semantic and Sentiment Analysis: The Joint Effect on Automated Essay Evaluation

Manual grading of essays is time-consuming and prone to inconsistencies and inaccuracies. In recent years, considerable research has gone into automating essay evaluation, yet little of it considers the syntax, semantic coherence, and sentiment of an essay's text together. Our proposed system goes beyond rule-based grammar and surface-level coherence checks to include the semantic similarity of sentences. We propose to use graph-based relationships within the essay's content together with the polarity of opinion expressions. Semantic similarity is computed between the statements of the essay to form these graph-based spatial relationships, from which novel features are derived. Our algorithm uses 23 salient features with high predictive power, fewer than current systems use, while still covering the dimensions that a human grader focuses on. Using fewer features removes redundancy in the data, so that predictions rest on more representative features and are robust to noisy data. Scores are predicted with neural networks using the data released for the ASAP competition hosted on Kaggle. Agreement between the human graders' scores and the system's predictions is measured using Quadratic Weighted Kappa (QWK). Our system achieves a QWK of 0.793.
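
For readers unfamiliar with the agreement metric, the following is a minimal sketch of how Quadratic Weighted Kappa can be computed from two lists of integer scores. It follows the standard definition of the metric and is not taken from the authors' implementation; the function name `quadratic_weighted_kappa` and its parameters are hypothetical, chosen for illustration only.

```python
import numpy as np


def quadratic_weighted_kappa(human, predicted, min_score, max_score):
    """Agreement between two integer score lists (illustrative helper).

    Returns 1.0 for perfect agreement, about 0.0 for chance-level
    agreement, and negative values for systematic disagreement.
    """
    human = np.asarray(human, dtype=int)
    predicted = np.asarray(predicted, dtype=int)
    n = max_score - min_score + 1
    if n < 2:
        raise ValueError("need at least two distinct score levels")

    # Observed matrix: cell (i, j) counts essays scored i by the human
    # grader and j by the system.
    observed = np.zeros((n, n))
    for h, p in zip(human, predicted):
        observed[h - min_score, p - min_score] += 1

    # Quadratic penalty grows with the squared distance between scores.
    idx = np.arange(n)
    weights = (idx[:, None] - idx[None, :]) ** 2 / (n - 1) ** 2

    # Expected matrix under independence of the two graders, scaled to
    # the same total count as the observed matrix.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()

    denominator = (weights * expected).sum()
    if denominator == 0:
        # Both graders gave every essay the same single score.
        return 1.0
    return 1.0 - (weights * observed).sum() / denominator


if __name__ == "__main__":
    # Toy scores, invented for this example only.
    human_scores = [1, 2, 3, 4, 4, 2]
    system_scores = [1, 2, 3, 4, 3, 2]
    print(quadratic_weighted_kappa(human_scores, system_scores, 1, 4))
```

Because the penalty is quadratic, a prediction that is two points off costs four times as much as one that is one point off, which is why the metric suits ordinal essay scores.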
