Threshold-Based Retrieval and Textual Entailment Detection on Legal Bar Exam Questions

Getting an overview of the legal domain has become challenging, especially in a broad, international context. Legal question answering systems have the potential to ease this task by automatically retrieving relevant legal texts for a specific statement and checking whether the meaning of the statement can be inferred from the retrieved documents. We investigate a combination of the BM25 scoring method of Elasticsearch with word embeddings trained on English translations of the German and Japanese civil law. For this, we define criteria that select a dynamic number of relevant documents according to threshold scores. For the textual entailment task, exploiting two deep learning classifiers and their respective prediction biases with a threshold-based answer inclusion criterion has proven beneficial compared to the baseline.
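
The abstract does not spell out the threshold criteria, so the following Python sketch only illustrates the general idea and is not the authors' method. It assumes each candidate document already carries a single relevance score (for example a BM25 score, possibly combined with an embedding similarity). The function names, the `ratio`, `max_docs`, and `include_threshold` parameters, and the rule of keeping everything within a fixed fraction of the top score are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def select_by_threshold(
    scored_docs: List[Tuple[str, float]],
    ratio: float = 0.8,
    max_docs: int = 5,
) -> List[str]:
    """Return a dynamic number of document ids based on a score threshold.

    Hypothetical criterion: keep every document whose score is at least
    `ratio` times the best score, capped at `max_docs` results.
    Scores are assumed to be non-negative, as BM25 scores are.
    """
    if not scored_docs:
        return []
    # Rank from highest to lowest score.
    ranked = sorted(scored_docs, key=lambda pair: pair[1], reverse=True)
    cutoff = ranked[0][1] * ratio
    kept = [doc_id for doc_id, score in ranked if score >= cutoff]
    return kept[:max_docs]


def combine_predictions(
    p_yes_a: float,
    p_yes_b: float,
    include_threshold: float = 0.7,
) -> bool:
    """Combine two entailment classifiers with an inclusion threshold.

    Hypothetical rule: answer "yes" (True) only if at least one classifier
    assigns the "yes" class a probability of `include_threshold` or more.
    """
    return max(p_yes_a, p_yes_b) >= include_threshold


if __name__ == "__main__":
    # Made-up scores for three candidate articles.
    candidates = [("art_1", 12.0), ("art_2", 10.5), ("art_3", 4.0)]
    print(select_by_threshold(candidates))
    print(combine_predictions(0.9, 0.2))
```

With a relative cutoff like this, a query with one clearly dominant match returns a single document, while a query with several near-ties returns more, which is one way to obtain a dynamic result count without fixing k in advance.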

Gunter Saake | Wolfram Fenske | Sayed Anisul Hoque | Sabine Wehnert
