Paper Information - Exploiting Properties of Legislative Texts to Improve Classification Accuracy

Organizing legislative texts into a hierarchy of legal topics improves access to legislation. Manually placing every part of a new legislative text in the correct position in the hierarchy, however, is expensive and slow, which naturally calls for automation. In this paper, we assess the ability of machine learning methods to produce a model that automatically classifies legislative texts into a legal topic hierarchy. We investigate whether such methods can generalize across different codes. The classification process exploits the specific properties of legislative documents: both the hierarchical structure of legal codes and the references within the legal document collection are taken into account. We argue that closer cooperation between legal and machine learning experts is the main direction for future work.

Chris Cornelis | Greet Van Eetvelde | Geert De Meyer | Rob Opsomer
