Where is the beauty?: retrieving appealing video scenes by learning Flickr-based graded judgments

In this paper we describe a system that automatically extracts appealing scenes from a set of broadcast videos. Unlike traditional computational aesthetic models, which try to predict a hard-to-measure degree of "beauty", our system retrieves "interesting" scenes. We create a training database of Flickr images, each annotated with its Flickr "interestingness" degree. We then extract existing and novel aesthetic and semantic features from the training set. Based on these features, we build a graded-relevance "interestingness" model and rank the test shots according to their predicted "interestingness".
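
The train-then-rank pipeline outlined above can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the two toy features (hypothetically "colorfulness" and "contrast"), the three-level grades, the shot identifiers, and the choice of a plain least-squares linear scorer fitted by gradient descent. The abstract does not specify which learner or features the system actually uses.

```python
from typing import Dict, List, Sequence, Tuple


def fit_linear_scorer(
    features: Sequence[Sequence[float]],
    grades: Sequence[float],
    lr: float = 0.1,
    epochs: int = 2000,
) -> Tuple[List[float], float]:
    """Fit weights and a bias by batch gradient descent on squared error.

    Assumption: a linear regressor stands in for the graded-relevance model.
    """
    n, d = len(features), len(features[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for x, y in zip(features, grades):
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Predicted grade for one feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def rank_shots(
    w: Sequence[float], b: float, shots: Dict[str, Sequence[float]]
) -> List[str]:
    """Return shot ids sorted by predicted score, highest first."""
    scored = [(predict(w, b, x), shot_id) for shot_id, x in shots.items()]
    return [shot_id for _, shot_id in sorted(scored, reverse=True)]


# Hypothetical training images: [colorfulness, contrast] with grades 0, 1, 2.
train_x = [[0.1, 0.2], [0.2, 0.1], [0.5, 0.5], [0.6, 0.4], [0.9, 0.8], [0.8, 0.9]]
train_y = [0, 0, 1, 1, 2, 2]

# Hypothetical test shots, each represented by one feature vector.
test_shots = {
    "shot_a": [0.15, 0.15],
    "shot_b": [0.85, 0.85],
    "shot_c": [0.50, 0.45],
}

w, b = fit_linear_scorer(train_x, train_y)
ranking = rank_shots(w, b, test_shots)
print(ranking)  # ['shot_b', 'shot_c', 'shot_a']
```

The point of the sketch is only the shape of the pipeline: graded labels on still images supervise a scoring function, which is then applied to features of unseen shots to order them. A learning-to-rank method could replace the regressor without changing that structure.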