Using maximum entropy for sentence extraction

A maximum entropy classifier can be used to extract sentences from documents. Experiments on technical documents show that such a classifier tends to treat features categorically, which yields worse performance than extracting sentences with a naive Bayes classifier. Adding an optimised prior to the maximum entropy classifier improves performance beyond that of naive Bayes, even when naive Bayes is also extended with a similar prior. Further experiments show that, given extremely informative features, maximum entropy yields excellent results, whereas naive Bayes cannot exploit such features and so fundamentally limits sentence extraction performance.
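The idea can be sketched as a binary maximum entropy (logistic regression) model over per-sentence features, where the Gaussian prior corresponds to an L2 penalty on the weights. The features, toy data, and hyperparameters below are hypothetical illustrations, not the paper's actual feature set or training regime:

```python
import math

def train_maxent(X, y, sigma2=1.0, lr=0.1, epochs=500):
    """Binary maxent trained by gradient ascent on the penalised
    log-likelihood; a zero-mean Gaussian prior with variance sigma2
    contributes the L2 penalty term -w_j / sigma2 to each gradient."""
    d, n = len(X[0]), len(X)
    w = [0.0] * d
    for _ in range(epochs):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))       # P(extract | sentence)
            for j in range(d):
                grad[j] += (yi - p) * xi[j]
        for j in range(d):
            w[j] += lr * (grad[j] / n - w[j] / (sigma2 * n))
    return w

def predict(w, x):
    z = sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical binary features per sentence:
# [bias, contains_cue_phrase, is_first_sentence, is_long]
X = [[1, 1, 1, 0], [1, 1, 0, 1], [1, 0, 1, 1],
     [1, 0, 0, 0], [1, 0, 0, 1], [1, 1, 0, 0]]
y = [1, 1, 1, 0, 0, 1]   # 1 = sentence belongs in the extract

w = train_maxent(X, y)
# Sentences exhibiting the cue features score higher than bare ones.
print(predict(w, [1, 1, 1, 0]), predict(w, [1, 0, 0, 0]))
```

Unlike naive Bayes, which multiplies per-feature likelihoods under an independence assumption, the maxent weights are fit jointly, so correlated or highly informative features can be exploited without double-counting; the prior keeps the weights from saturating on sparse features.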
