Paper information - Machine Learning versus Knowledge Based Classification of Legal Texts

This paper presents the results of an experiment in which we used machine learning (ML) techniques to classify sentences in Dutch legislation. These results are compared to those of a pattern-based classifier. Overall, the ML classifier is as accurate (>90%) as the pattern-based one, but seems to generalize worse to new laws. Given these results, the pattern-based approach is to be preferred, since its reasons for classification are clear and can be used for further modelling of the content of the sentences.

[1] Andrea Passerini, et al. Automatic Classification of Provisions in Legislative Texts, 2007, Artificial Intelligence and Law.

[2] Corinna Cortes, et al. Support-Vector Networks, 1995, Machine Learning.

[3] Radboud Winkels, et al. Suggesting Model Fragments for Sentences in Dutch Laws, 2010.

[4] Teresa Gonçalves, et al. Is linguistic information relevant for the classification of legal texts?, 2005, ICAIL '05.

[5] Chris Cornelis, et al. Exploiting Properties of Legislative Texts to Improve Classification Accuracy, 2009, JURIX.

[6] Bart Verheij, et al. About the logical relations between cases and rules, 2008, JURIX.

[7] Radboud Winkels, et al. A next step towards automated modelling of sources of law, 2009, ICAIL.

[8] Ian H. Witten, et al. The WEKA data mining software: an update, 2009, SKDD.

[9] Fabrizio Sebastiani, et al. Machine learning in automated text categorization, 2001, CSUR.

[10] Yiming Yang, et al. A re-examination of text categorization methods, 1999, SIGIR '99.

[11] Thorsten Joachims, et al. Text Categorization with Support Vector Machines: Learning with Many Relevant Features, 1998, ECML.

[12] Radboud Winkels, et al. Categorisation of Norms, 2007, JURIX.