Paper information - Sequence modelling for sentence classification in a legal summarisation system

We describe a set of experiments using a wide range of machine learning techniques for the task of predicting the rhetorical status of sentences. The research is part of a text summarisation project for the legal domain for which we use a new corpus of judgments of the UK House of Lords. We present experimental results for classification according to a rhetorical scheme indicating a sentence's contribution to the overall argumentative structure of the legal judgments using four learning algorithms from the Weka package (C4.5, naïve Bayes, Winnow and SVMs). We also report results using maximum entropy models both in a standard classification framework and in a sequence labelling framework. The SVM classifier and the maximum entropy sequence tagger yield the most promising results.

Claire Grover | Ben Hachey
