PPSGen: Learning to Generate Presentation Slides for Academic Papers

In this paper, we investigate the challenging task of automatically generating presentation slides for academic papers. The generated slides can serve as drafts that help presenters prepare their formal slides more quickly. We propose a novel system, PPSGen, to address this task. It first employs regression methods to learn the importance of the sentences in an academic paper, and then uses integer linear programming (ILP) to generate well-structured slides by selecting and aligning key phrases and sentences. Evaluation results on a test set of 200 pairs of papers and slides collected from the web demonstrate that PPSGen can generate slides of better quality. A user study also shows that PPSGen has a few evident advantages over baseline methods.
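The selection step described above can be illustrated with a small sketch. It is not the PPSGen formulation: the importance scores are hypothetical stand-ins for the output of a regression model, the word budget is an assumed constraint, and the 0/1 selection problem is solved by exhaustive enumeration rather than by an ILP solver, which is only practical for a handful of sentences. The function name `select_sentences` and all example data are made up for illustration.

```python
from itertools import product


def select_sentences(sentences, scores, max_words):
    """Return the subset of sentences with the highest total score
    whose combined word count stays within max_words."""
    lengths = [len(s.split()) for s in sentences]
    best_value, best_pick = 0.0, ()
    # Enumerate every 0/1 assignment (one binary variable per sentence).
    # An ILP solver would search this space far more efficiently.
    for pick in product((0, 1), repeat=len(sentences)):
        words = sum(n for n, x in zip(lengths, pick) if x)
        if words > max_words:
            continue  # violates the length constraint
        value = sum(sc for sc, x in zip(scores, pick) if x)
        if value > best_value:
            best_value, best_pick = value, pick
    return [s for s, x in zip(sentences, best_pick) if x]


# Hypothetical sentences and importance scores (assumed, not from the paper).
sentences = [
    "We propose a slide generation system.",
    "Sentence importance is learned with regression.",
    "Related work is discussed elsewhere.",
    "An ILP selects key phrases and sentences.",
]
scores = [0.9, 0.7, 0.2, 0.8]

chosen = select_sentences(sentences, scores, max_words=14)
for sentence in chosen:
    print("-", sentence)
```

Keeping the scoring and the selection separate mirrors the two-stage description in the abstract: any model that assigns a score per sentence could feed the same selection step.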
