Interpretable Approach in the Classification of Sequences of Legal Texts

Machine learning applications in the legal field are numerous and diverse. To contribute to both the machine learning and legal communities, we developed a model for classifying sequences of legal texts that prioritizes the interpretability of its results. The purpose of this paper is to classify Brazilian legal proceedings into three status classes: (i) archived proceedings, (ii) active proceedings, and (iii) suspended proceedings. Although Portuguese NLP can be difficult due to the scarcity of resources, our approach performed remarkably well on the classification task. Furthermore, we were able to extract and interpret the patterns learned by the neural network and to quantify how those patterns relate to the classification task.
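To make the task concrete (this is not the paper's model, only an illustrative sketch of its input/output shape): each proceeding is a sequence of text updates, and a classifier maps that sequence to one of the three status classes. The toy nearest-centroid classifier below over bag-of-words vectors is entirely hypothetical; the sample texts and helper names are invented for illustration.

```python
from collections import Counter
import math

# The three status classes described in the abstract.
STATUS_CLASSES = ["archived", "active", "suspended"]

def bow(texts):
    """Bag-of-words vector for one proceeding (a sequence of text updates)."""
    counts = Counter()
    for t in texts:
        counts.update(t.lower().split())
    return counts

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def train_centroids(labeled):
    """labeled: list of (sequence_of_texts, class_name) pairs.
    Returns one summed word-count centroid per status class."""
    centroids = {c: Counter() for c in STATUS_CLASSES}
    for texts, label in labeled:
        centroids[label].update(bow(texts))
    return centroids

def classify(texts, centroids):
    """Assign a proceeding (sequence of texts) to the most similar class."""
    v = bow(texts)
    return max(STATUS_CLASSES, key=lambda c: cosine(v, centroids[c]))
```

A usage sketch: train on a few labeled proceedings, then call `classify(["new hearing scheduled"], centroids)` for an unseen sequence. The actual paper uses a neural network over text sequences and additionally interprets the learned patterns, which this baseline does not attempt.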
