Paper information - Classification and clustering for case-based criminal summary judgments

Classification and clustering for case-based criminal summary judgments

We investigate the effectiveness of machine-generated criteria for classification problems related to criminal summary judgments. Our system uses documents from closed lawsuits as training data to generate keyword-based and case-based classification criteria, and then applies these machine-generated criteria to the classification tasks. To construct the databases of classification criteria, we employ different levels of lexical knowledge to extract information from legal documents in Chinese, and build a case instance for each closed lawsuit. Experimental results indicate that case-based classification outperforms keyword-based classification, and that machine-generated cases may offer accuracy about 7% below that of human-provided cases. To improve the inference efficiency of our classifiers, we also design methods that merge the machine-generated criteria. Empirical results show that our methods can keep classification quality within 20% of that achieved with human-provided cases, even when we aggressively reduce the number of machine-generated cases by about 70%.
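The idea of case-based classification with merged cases can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the case representation, or the merging procedure, so everything below is an assumption made for illustration: cases are modeled as keyword sets, similarity is Jaccard overlap, classification is nearest-neighbor voting, and `merge_cases` is a hypothetical greedy merge. The keywords and labels are toy English data, not drawn from the paper.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two keyword sets (assumed measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def classify(query, cases, k=1):
    """Return the majority label among the k stored cases most similar to query."""
    ranked = sorted(cases, key=lambda c: jaccard(query, c[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def merge_cases(cases, threshold=0.5):
    """Greedily merge same-label cases whose keyword overlap reaches threshold.

    Hypothetical procedure: each case is folded into the first already-kept
    case with the same label and sufficient similarity, otherwise kept as new.
    """
    merged = []
    for keywords, label in cases:
        keywords = set(keywords)
        for i, (m_kw, m_label) in enumerate(merged):
            if m_label == label and jaccard(keywords, m_kw) >= threshold:
                merged[i] = (m_kw | keywords, label)
                break
        else:
            merged.append((keywords, label))
    return merged


# Toy case base: (keyword set, label) pairs, invented for illustration.
cases = [
    ({"steal", "wallet", "store"}, "theft"),
    ({"steal", "bicycle", "store"}, "theft"),
    ({"dice", "bet", "money"}, "gambling"),
    ({"cards", "bet", "money"}, "gambling"),
]

compact = merge_cases(cases)
print(len(cases), "cases merged into", len(compact))
print(classify({"steal", "store", "phone"}, compact))
```

Merging trades some specificity for a smaller case base to scan at inference time, which is the kind of efficiency-versus-quality trade-off the abstract reports; the threshold here is an arbitrary illustrative value.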

Chao-Lin Liu | Cheng-Tsung Chang | Jim-How Ho
