EdVidParse: Detecting people and content in educational videos
There are thousands of hours of educational content
on the Internet, with services like edX, Coursera, Berkeley
WebCasts, and others offering hundreds of courses to hundreds of
thousands of learners. Consequently, researchers are interested in
the effectiveness of video learning. While educational videos vary,
they share two common attributes: people and textual content.
People present content to learners in the form of text,
graphs, charts, tables, and diagrams. With an annotation of people
and textual content in an educational video, researchers can study
the relationship between video learning and retention. This thesis
presents EdVidParse, an automatic tool that takes an educational
video and annotates it with bounding boxes around the people and
textual content. EdVidParse uses internal features from deep
convolutional neural networks to estimate the bounding boxes,
achieving an average precision (AP) of 0.43 on a test set. Three applications of
EdVidParse are presented: identifying the video type, identifying
people and textual content for interface design, and removing a
person from a picture-in-picture video. EdVidParse
provides an easy interface for identifying people and textual
content inside educational videos for use in video annotation,
interface design, and video reconfiguration.
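
To make the AP figure above concrete, the following is a minimal sketch of how predicted bounding boxes might be scored against ground-truth annotations. The abstract does not state the evaluation protocol, so everything here is an assumption for illustration: the 0.5 intersection-over-union (IoU) threshold, the non-interpolated form of AP, the single-frame and single-class setting, and the function names `iou` and `average_precision`, which are hypothetical and not part of EdVidParse.

```python
from typing import List, Tuple

# A box is (x_min, y_min, x_max, y_max) in pixel coordinates (assumed layout).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    detections: List[Tuple[float, Box]],
    truths: List[Box],
    iou_threshold: float = 0.5,  # assumed threshold, not taken from the thesis
) -> float:
    """Non-interpolated AP for one class: mean precision at each true positive,
    normalised by the number of ground-truth boxes."""
    if not truths:
        return 0.0
    matched = set()
    true_positives = 0
    precision_sum = 0.0
    # Visit detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    for rank, (_, box) in enumerate(ranked, start=1):
        # Greedily pick the best-overlapping ground truth not yet matched.
        best_idx, best_iou = -1, 0.0
        for idx, truth in enumerate(truths):
            if idx in matched:
                continue
            overlap = iou(box, truth)
            if overlap > best_iou:
                best_idx, best_iou = idx, overlap
        if best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / len(truths)


if __name__ == "__main__":
    # Made-up boxes: two annotated regions and three detections.
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    detections = [
        (0.9, (0, 0, 10, 10)),    # correct
        (0.8, (50, 50, 60, 60)),  # false positive
        (0.7, (20, 20, 30, 30)),  # correct
    ]
    print(average_precision(detections, truths))
```

In this toy case the false positive ranked second lowers the precision at the second correct detection to 2/3, so the AP is (1 + 2/3) / 2. Detection benchmarks differ in how they interpolate the precision-recall curve and aggregate over frames and classes, so scores from this sketch are not directly comparable to the 0.43 reported above.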