Aspect Classification for Legal Depositions

Attorneys and others have a strong interest in a digital library with suitable services (e.g., summarizing, searching, and browsing) to help them work with large corpora of legal depositions. Their needs often involve understanding the semantics of such documents. That depends in part on the role of the deponent, e.g., plaintiff, defendant, law enforcement personnel, or expert. In tort litigation associated with property and casualty insurance claims, such as those relating to an injury, it is important to know not only about liability, but also about events, accidents, physical conditions, and treatments. We hypothesize that a legal deposition consists of various aspects that are discussed as part of the deponent's testimony. Accordingly, we developed an ontology of aspects in a legal deposition for accident and injury cases. Using that ontology, we developed a classifier that can identify the portions of text belonging to each aspect of interest. Doing so was complicated by the peculiarities of this genre, e.g., that deposition transcripts generally consist of question-answer (QA) pairs. Our automated system therefore starts with pre-processing and then transforms the QA pairs into a canonical form made up of declarative sentences. Classifying the generated declarative sentences by aspect can then help with downstream tasks such as summarization, segmentation, question-answering, and information retrieval. Our methods have achieved a classification F1 score of 0.83. Classifying the aspects with good accuracy will help in choosing QA pairs that can serve as candidate summary sentences and in generating an informative summary for legal professionals or insurance claim agents. Our methodology could be extended to legal depositions of other kinds and could aid services such as searching.
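For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F1 score can be computed for a multi-class aspect classifier. It is illustrative only: the abstract does not state which averaging scheme produced the 0.83 figure, so macro averaging is an assumption here, and the aspect labels and the function names `f1_per_class` and `macro_f1` are hypothetical.

```python
from collections import Counter


def f1_per_class(gold, pred):
    """Return {label: F1} for parallel lists of gold and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1      # correct prediction for label g
        else:
            fp[p] += 1      # p was predicted but is wrong
            fn[g] += 1      # g was the true label but was missed
    scores = {}
    for label in set(gold) | set(pred):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        # F1 is the harmonic mean of precision and recall.
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    scores = f1_per_class(gold, pred)
    return sum(scores.values()) / len(scores)


# Hypothetical aspect labels for six declarative sentences (toy data, not from the paper).
gold = ["event", "event", "treatment", "liability", "treatment", "liability"]
pred = ["event", "treatment", "treatment", "liability", "treatment", "event"]

if __name__ == "__main__":
    for label, score in sorted(f1_per_class(gold, pred).items()):
        print(f"{label}: {score:.3f}")
    print(f"macro F1: {macro_f1(gold, pred):.3f}")
```

Macro averaging weights every aspect equally regardless of how often it occurs. A micro or weighted average would give a different number on imbalanced data, which is why the averaging scheme matters when comparing reported scores.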
