Identifying Relationships Among Sentences in Court Case Transcripts Using Discourse Relations

Case law has a significant impact on the proceedings of legal cases, so the information that can be obtained from previous court cases is valuable to lawyers and other legal officials in performing their duties. This paper describes a methodology for applying discourse relations between sentences when processing text documents from the legal domain. In this study, we developed a mechanism to classify the relationships that can be observed among sentences in transcripts of United States court cases. First, we defined the relationship types that can be observed between sentences in court case transcripts. Then we classified pairs of sentences according to relationship type by combining a machine learning model with a rule-based approach. The results obtained through our system were evaluated by human judges. To the best of our knowledge, this is the first study in which discourse relations between sentences have been used to determine relationships among sentences in legal court case transcripts.
