Challenges for Information Extraction from Dialogue in Criminal Law

Information extraction and question answering have the potential to introduce a new paradigm for how machine learning is applied to criminal law. Existing approaches generally use tabular data for predictive metrics. An alternative approach is needed for matters of equitable justice, where individuals are judged on a case-by-case basis in a process involving verbal or written discussion and interpretation of case factors. Such discussions are individualized, but they nonetheless rely on underlying facts. Information extraction can play an important role in surfacing these facts, which remain important to understand. We analyze the ability of unsupervised, weakly supervised, and pre-trained models to extract such factual information from the free-form dialogue of California parole hearings. With a few exceptions, F1 scores are below 0.85. We use these results to highlight opportunities for further research in information extraction and question answering. We encourage new developments in NLP that enable legal cases to be analyzed and reviewed in a post-hoc, rather than predictive, manner.
