Supervised Machine Learning for Summarizing Legal Documents

This paper presents a supervised machine learning approach to summarizing legal documents. A commercial system for the analysis and summarization of legal documents provided us with a corpus of almost 4,000 text and extract pairs for our machine learning experiments. That corpus was pre-processed to identify the selected source sentences in the extracts from which we generated legal structured data. Finally, we describe our sentence classification experiments, which rely on a Naive Bayes classifier using a set of surface, emphasis, and content features.
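
To make the sentence classification setup concrete, the following is a minimal sketch of a Bernoulli Naive Bayes classifier over binary sentence features. It is an illustration under stated assumptions, not the system used in the experiments: the feature names (`early_position`, `long_sentence`, `has_cue_word`, `has_digit`), the cue-word list, and the thresholds are hypothetical stand-ins for the surface, emphasis, and content feature families, and the class `BernoulliNaiveBayes` is written from scratch for the example.

```python
import math
from collections import defaultdict


def extract_features(sentence, index, total,
                     cue_words=("held", "ordered", "concluded")):
    """Map a sentence to binary features (all hypothetical examples)."""
    tokens = sentence.lower().split()
    return {
        # Surface features: position in the document and sentence length.
        "early_position": index < max(1, total // 5),
        "long_sentence": len(tokens) >= 8,
        # Emphasis feature: presence of an assumed cue word.
        "has_cue_word": any(t.strip(".,;") in cue_words for t in tokens),
        # Content feature: presence of a digit (e.g. a date or citation).
        "has_digit": any(ch.isdigit() for ch in sentence),
    }


class BernoulliNaiveBayes:
    """Naive Bayes over binary features with Laplace smoothing."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.names = sorted(rows[0])
        counts = {c: 0 for c in self.classes}
        true_counts = {c: defaultdict(int) for c in self.classes}
        for row, label in zip(rows, labels):
            counts[label] += 1
            for name in self.names:
                if row[name]:
                    true_counts[label][name] += 1
        n = len(labels)
        self.log_prior = {c: math.log(counts[c] / n) for c in self.classes}
        # P(feature is true | class), smoothed so no probability is 0 or 1.
        self.p_true = {
            c: {name: (true_counts[c][name] + 1) / (counts[c] + 2)
                for name in self.names}
            for c in self.classes
        }
        return self

    def log_scores(self, row):
        """Return the unnormalized log posterior for each class."""
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            for name in self.names:
                p = self.p_true[c][name]
                score += math.log(p if row[name] else 1.0 - p)
            scores[c] = score
        return scores

    def predict(self, row):
        scores = self.log_scores(row)
        return max(scores, key=scores.get)
```

In this sketch, each training row would be the feature dictionary of one source sentence, and its label would be `True` when that sentence was selected for the extract and `False` otherwise. Calling `predict` on the features of an unseen sentence then returns the more probable label under the independence assumption.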