Syntactic based approach for grammar question retrieval

Abstract With the popularity of online educational platforms, English learners can study and practice wherever they are. Grammar is an important component of learning English, and learning it effectively requires students to practice questions that target specific grammar knowledge. In this paper, we study a novel problem: retrieving English grammar questions that share a similar grammatical focus. Because grammatical focus similarity differs from both textual similarity and sentence-level syntactic similarity, existing approaches cannot be applied directly to this problem. To address it, we propose a syntactic-based approach for English grammar question retrieval that effectively retrieves related grammar questions with a similar grammatical focus. In the proposed approach, we first introduce a new syntactic tree, the parse-key tree, to capture the grammatical focus of an English grammar question. Next, we propose two kernel functions, the relaxed tree kernel and the part-of-speech order kernel, to compute the similarity between the parse-key tree of the query and the parse-key tree of each grammar question in the collection. The retrieved grammar questions are then ranked according to this similarity. In addition, if a query is submitted together with answer choices, conceptual similarity and textual similarity are also incorporated to further improve retrieval accuracy. Experimental results show that the proposed approach outperforms state-of-the-art methods based on statistical analysis and syntactic analysis.
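The retrieval step described above (compare the query's tree with every tree in the collection, then rank by similarity) can be illustrated with a minimal sketch. The abstract does not define the parse-key tree, the relaxed tree kernel, or the part-of-speech order kernel, so this sketch substitutes the standard subtree-counting convolution kernel of Collins and Duffy as a stand-in; it is not the paper's kernel. The tuple-based tree encoding, the function names, the `decay` parameter, and the toy trees are all assumptions made for illustration.

```python
from math import sqrt

# Assumed encoding: a tree is a tuple (label, child, child, ...);
# a leaf is a one-element tuple such as ("VBZ",).

def nodes(tree):
    """Yield every internal node (a node with at least one child)."""
    _label, *children = tree
    if children:
        yield tree
        for child in children:
            yield from nodes(child)

def production(node):
    """Return the grammar production at a node: (label, child labels)."""
    label, *children = node
    return (label, tuple(child[0] for child in children))

def common_subtrees(n1, n2, decay):
    """Decay-weighted count of common subtrees rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = decay
    for c1, c2 in zip(n1[1:], n2[1:]):
        if len(c1) > 1 and len(c2) > 1:  # recurse only into internal children
            score *= 1.0 + common_subtrees(c1, c2, decay)
    return score

def tree_kernel(t1, t2, decay=0.5):
    """Stand-in kernel (Collins-Duffy style), NOT the paper's relaxed tree kernel."""
    return sum(common_subtrees(a, b, decay) for a in nodes(t1) for b in nodes(t2))

def similarity(t1, t2, decay=0.5):
    """Kernel value normalized to [0, 1] so trees of different sizes are comparable."""
    norm = sqrt(tree_kernel(t1, t1, decay) * tree_kernel(t2, t2, decay))
    return tree_kernel(t1, t2, decay) / norm if norm else 0.0

def rank(query_tree, collection, decay=0.5):
    """Return (score, question_id) pairs sorted by descending similarity."""
    scored = [(similarity(query_tree, tree, decay), qid)
              for qid, tree in collection.items()]
    return sorted(scored, key=lambda pair: -pair[0])

# Hypothetical toy trees over part-of-speech tags only (no words).
query = ("S", ("NP", ("PRP",)), ("VP", ("VBZ",), ("VP", ("VBN",))))
collection = {
    "q1": ("S", ("NP", ("DT",), ("NN",)), ("VP", ("VBD",))),
    "q2": ("S", ("NP", ("PRP",)), ("VP", ("VBZ",), ("VP", ("VBN",)))),
}
ranking = rank(query, collection)
```

In this toy collection, `q2` has the same structure as the query and is therefore ranked first. Normalizing the kernel keeps large trees from dominating the ranking simply because they contain more subtrees.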
