SlideGen: an abstractive section-based slide generator for scholarly documents

Presentation slides generated from research papers summarize those papers, primarily to guide talks. Generating slides manually is labor-intensive. We propose a method to automatically generate slides for scientific articles, based on a corpus of 5000 paper-slide pairs compiled from conference proceedings websites, which is the largest dataset used for scholarly article summarization. We generate slides 1) extractively, by selecting salient sentences from the paper, and 2) abstractively, by fine-tuning pre-trained language models to learn the language of slides. The results show that the extractive models achieve higher ROUGE scores. However, the abstractive summaries are less verbose and follow the language of slides by generating phrases rather than full sentences.
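To make the extractive setting and its evaluation concrete, the following is a minimal sketch and not the method used in this work. It assumes a simple word-frequency salience score as a stand-in for a learned sentence selector, and a unigram-overlap recall as a simplified form of ROUGE-1. The function names `select_salient` and `rouge1_recall` are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def select_salient(sentences, k=2):
    """Pick the k sentences with the highest average word frequency.

    Assumption: frequency-based salience is a toy heuristic standing in
    for a trained extractive model. Selected sentences keep paper order.
    """
    freq = Counter(tok for s in sentences for tok in tokenize(s))

    def score(sentence):
        toks = tokenize(sentence)
        return sum(freq[t] for t in toks) / len(toks) if toks else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]


def rouge1_recall(candidate, reference):
    """Fraction of reference unigrams (with multiplicity) found in the candidate."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[tok]) for tok, count in ref.items())
    return overlap / sum(ref.values())


if __name__ == "__main__":
    paper = [
        "Slides summarize papers.",
        "We generate slides from papers.",
        "Thanks.",
    ]
    slide_text = " ".join(select_salient(paper, k=2))
    print(slide_text)
    print(rouge1_recall(slide_text, "generate slides from papers"))
```

Because full sentences copied from the paper tend to share many words with reference slides, a selector of this kind can score well on overlap metrics while remaining more verbose than phrase-style slide text, which is the trade-off described above.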