Plain English Summarization of Contracts

Unilateral contracts, such as terms of service, play a substantial role in modern digital life. However, few users read these documents before accepting their terms, because the documents are too long and their language is too complicated. We propose the task of summarizing such legal documents in plain English, which would help users better understand the terms they are accepting. We introduce an initial dataset of legal text snippets paired with plain-English summaries. We verify the quality of these summaries manually and show that they involve heavy abstraction, compression, and simplification. Initial experiments show that unsupervised extractive summarization methods perform poorly on this task because of the level of abstraction and the differences in style between source and summary. We conclude with a call for resource and technique development for simplification and style transfer for legal language.
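
To make "compression" and "abstraction" concrete, here is a minimal Python sketch of two rough proxies one might compute for a snippet/summary pair: the ratio of summary length to source length, and the share of summary words that never occur in the source. The function names, the regex tokenizer, and the example pair are assumptions made for illustration; they are not the measures or data reported by the authors.

```python
import re


def tokenize(text):
    """Lowercase and split into word tokens (a deliberately crude assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def compression_ratio(source, summary):
    """Summary length divided by source length, in tokens (lower = more compressed)."""
    src = tokenize(source)
    if not src:
        return 0.0
    return len(tokenize(summary)) / len(src)


def novel_word_fraction(source, summary):
    """Share of summary tokens absent from the source (higher = more abstractive)."""
    src_vocab = set(tokenize(source))
    summ = tokenize(summary)
    if not summ:
        return 0.0
    novel = sum(1 for word in summ if word not in src_vocab)
    return novel / len(summ)


if __name__ == "__main__":
    # Hypothetical pair written for this example, not taken from the dataset.
    snippet = ("The licensee shall not sublicense, assign, or otherwise "
               "transfer any rights granted hereunder.")
    plain = "You can't give these rights to anyone else."
    print("compression ratio:", compression_ratio(snippet, plain))
    print("novel word fraction:", novel_word_fraction(snippet, plain))
```

A high novel-word fraction suggests why purely extractive methods struggle here: they can only copy source wording, while the plain-English summaries largely use words that the legal text does not contain.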
