Legal Question Answering Using Ranking SVM and Syntactic/Semantic Similarity

We describe a legal question answering system that combines legal information retrieval and textual entailment. We evaluated the system on data from the first Competition on Legal Information Extraction/Entailment (COLIEE), held in 2014. The competition focuses on two aspects of legal information processing related to answering yes/no questions from Japanese legal bar exams, and the shared task consists of two phases: legal ad hoc information retrieval and textual entailment. The first phase requires identifying the Japanese Civil Code articles relevant to a legal bar exam query. For this task we implemented two unsupervised baseline models (tf-idf and Latent Dirichlet Allocation (LDA)-based information retrieval (IR)) and a supervised model, Ranking SVM, whose features are a set of words together with each article's scores under the two baseline models. The results show that the Ranking SVM model nearly doubles the Mean Average Precision of both baselines. The second phase is to answer "Yes" or "No" to previously unseen queries by comparing the meaning of each query with that of the relevant articles. The features used in phase two are syntactic/semantic similarities and the identification of negation/antonym relations. The results show that our method, which combines a rule-based model with an unsupervised model, outperforms an SVM-based supervised model.
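The phase-one pipeline described above can be sketched as follows. This is a minimal, illustrative reconstruction, not the authors' implementation: articles are scored with tf-idf, those scores become features, and a pairwise linear ranker is trained in the spirit of Ranking SVM (here a simple pairwise margin update stands in for an SVM solver; the toy corpus, feature set, and hyperparameters are all assumptions).

```python
import math
from collections import Counter

def tfidf_score(query, doc, docs):
    """Additive tf-idf relevance of one article (doc) to the query."""
    n_docs = len(docs)
    tf = Counter(doc)
    score = 0.0
    for w in query:
        df = sum(1 for d in docs if w in d)  # document frequency of w
        if df:
            score += tf[w] * math.log(n_docs / df)
    return score

def rank_train(pairs, dim, epochs=50, lr=0.1):
    """Pairwise training: for each (relevant, non-relevant) feature pair,
    push toward w . relevant > w . non_relevant + margin, which is the
    constraint form used by Ranking SVM."""
    w = [0.0] * dim
    for _ in range(epochs):
        for hi, lo in pairs:
            margin = sum(wi * (h - l) for wi, h, l in zip(w, hi, lo))
            if margin <= 1.0:  # constraint violated: update weights
                for i in range(dim):
                    w[i] += lr * (hi[i] - lo[i])
    return w

# Toy data (hypothetical): two "articles", one query, and a feature
# vector per article of [tf-idf score, article length].
docs = [["contract", "void", "minor"], ["property", "transfer", "deed"]]
query = ["contract", "minor"]
feats = [[tfidf_score(query, d, docs), len(d)] for d in docs]

# Article 0 is labeled relevant to the query, article 1 is not.
w = rank_train([(feats[0], feats[1])], dim=2)
scores = [sum(wi * fi for wi, fi in zip(w, f)) for f in feats]
assert scores[0] > scores[1]  # learned ranker prefers the relevant article
```

In the paper's setting the feature vector would also include the LDA-based IR score and word features, and the ranker would be trained with a proper SVM objective over many query-article pairs; the reduction from ranking to pairwise constraints is the same.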
