Video digests: a browsable, skimmable format for informational lecture videos

Increasingly, authors are publishing long informational talks, lectures, and distance-learning videos online. However, it is difficult to browse and skim the content of such videos using current timeline-based video players. Video digests are a new format for informational videos that affords browsing and skimming by segmenting videos into a chapter/section structure and providing short text summaries and thumbnails for each section. Viewers can navigate by reading the summaries and clicking on sections to access the corresponding point in the video. We present a set of tools to help authors create such digests using transcript-based interactions. With our tools, authors can manually create a video digest from scratch, or they can automatically generate a digest by applying a combination of algorithmic and crowdsourcing techniques and then manually refine it as needed. Feedback from first-time users suggests that our transcript-based authoring tools and automated techniques greatly facilitate video digest creation. In an evaluative crowdsourced study, we find that, given a short viewing time, video digests support browsing and skimming better than timeline-based or transcript-based video players.
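
The chapter/section structure described above can be pictured as a small nested data structure. The following is a minimal sketch, not the authors' implementation: the class names (`Section`, `Chapter`, `VideoDigest`), the field names, and the idea of storing one start time per section are all assumptions made for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List


@dataclass
class Section:
    start_sec: float      # assumed: where the section begins in the video
    summary: str          # short text summary shown to the viewer
    thumbnail: str = ""   # hypothetical path or URL of a representative frame


@dataclass
class Chapter:
    title: str
    sections: List[Section] = field(default_factory=list)


@dataclass
class VideoDigest:
    chapters: List[Chapter] = field(default_factory=list)

    def flat_sections(self) -> List[Section]:
        """Return all sections across chapters, ordered by start time."""
        return sorted(
            (s for c in self.chapters for s in c.sections),
            key=lambda s: s.start_sec,
        )

    def seek_time(self, chapter_idx: int, section_idx: int) -> float:
        """Time the player would jump to when a section is clicked."""
        return self.chapters[chapter_idx].sections[section_idx].start_sec

    def section_at(self, t: float) -> Section:
        """Section containing playback time t, e.g. for highlighting."""
        sections = self.flat_sections()
        starts = [s.start_sec for s in sections]
        # Last section whose start is <= t; clamp to the first section.
        i = bisect_right(starts, t) - 1
        return sections[max(i, 0)]
```

Under these assumptions, clicking a section maps to `digest.seek_time(chapter_idx, section_idx)`, and `digest.section_at(t)` gives the summary to highlight while the video plays.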
