Shot Boundary Detection Approaches:
• UIowa05SB01 – by-frame histogram similarity
• UIowa05SB02 – by-frame pixel distance similarity
• UIowa05SB03 – by-frame histogram * distance (product)
• UIowa05SB04 – by-frame HSB similarity
• UIowa05SB05 – by-frame pixel distance & HSB
• UIowa05SB06 – by-frame product & HSB
Our failure to remove the error introduced into last year's results makes any conclusions problematic.

Low-Level Feature Extraction Approach – sliding region window with pixel distance similarity, aggregated with a run-length threshold (an illustrative sketch appears after Table 2):
• UIowa05LF01 – run length of 5 frames, window range of 5 pixels + telltale location
• UIowa05LF02 – run length of 5 frames, window range of 10 pixels + telltale location
• UIowa05LF03 – run length of 10 frames, window range of 5 pixels + telltale location
• UIowa05LF04 – run length of 10 frames, window range of 10 pixels + telltale location
There was no distinction in performance for the task as defined. False negatives are typically fast pans/tilts that overrun the window; zoom logic is typically the cause of false positives for pans and tilts. We clearly need to rework our zoom logic and address coarse motion.

(Automatic) Search Approach – fully automatic search involving two different architectures, one TDT-derived (UIowa05ASxx) and one SVM-based (uiDJx):
• UIowa05AS01 – text only, named-entity vector matches against the provided sample
• UIowa05AS02 – text only, named-entity vector matches against the provided sample
• UIowa05AS03 – key-frame pixel distance similarity with the provided samples
• uiDJ1 – color information only, based on the HSB color space and computed from the average hue, saturation and brightness values of the given image
• uiDJ2 – edge information only, based on Canny's edge detection algorithm with a global edge ratio
• uiDJ3 – texture information only, based on Gray Level Co-occurrence Matrices (GLCM), which provide angular second moment, contrast, correlation, inverse difference moment and entropy measures
The text-only runs were much more effective than those based upon key frames, although text-only performance on TRECVID-style topics was quite variable. Next steps are to look at more frames in image-based comparison schemes.

Boundaries, Motion and Automatic Search at The University of Iowa

1 – Shot Boundary Detection

As described for previous workshops [3, 4], our shot boundary work was based upon three core techniques: histogram similarity, aggregate pixel distance and aggregate edge distance. Our composite HSB technique first performs a histogram-based cut detection and then overlays an averaged HSB gradual detection, with graduals trumping any contained cuts. Our official runs for this year (Table 1) still exhibit performance issues similar to those reported last year. Our assumption last year was that test logic accidentally left enabled for the evaluation runs had deleted the majority of the boundary declarations. Given that the only change from last year to this was attention to disabling that flag, it now appears that the integration of our new gradual transition logic is the actual root cause of our performance degradation relative to 2003 (as shown in Table 2). While the transition logic has improved gradual precision, frame recall and frame precision, it is clearly damaging overall performance.
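To make the histogram-based cut detection concrete, the following is a minimal sketch (in Python), not the code used for the UIowa05SB runs: it compares global color histograms of consecutive frames and declares a cut when their distance exceeds a threshold. The bin count, the L1 distance, and the threshold value are illustrative assumptions.

    # Minimal sketch of histogram-based cut detection; illustrative only,
    # not the UIowa implementation. Bin count and threshold are assumptions.
    import numpy as np

    def color_histogram(frame, bins=8):
        """Global color histogram of an RGB frame, normalized to sum to 1."""
        hist, _ = np.histogramdd(frame.reshape(-1, 3),
                                 bins=(bins, bins, bins),
                                 range=((0, 256),) * 3)
        return hist.ravel() / hist.sum()

    def detect_cuts(frames, threshold=0.35):
        """Declare a cut wherever consecutive frame histograms differ sharply."""
        cuts = []
        prev = color_histogram(frames[0])
        for i in range(1, len(frames)):
            curr = color_histogram(frames[i])
            # L1 distance between normalized histograms lies in [0, 2]
            if np.abs(curr - prev).sum() > threshold:
                cuts.append((i - 1, i))   # boundary between frames i-1 and i
            prev = curr
        return cuts

A gradual-transition overlay of the kind described above would then be layered on top of these cut declarations, with detected graduals taking precedence over any cuts they contain.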
2 – Low-Level Feature Extraction

The detection of low-level camera motion has substantial potential utility as a building block in the construction of higher-level features. Given this, our primary interest in the detection of camera motion is not the declaration of the presence or absence of a given motion in a given shot (i.e., the task as defined this year), but rather the recognition of

Table 1: Shot Boundary Task, Overall Results 2005

Run          Method          All           Cuts          Gradual
                             Rec    Prec   Rec    Prec   Rec    Prec   F-Rec  F-Prec
UIowa05SB01  histogram       0.097  0.166  0.124  0.164  0.016  0.241  0.364  0.615
UIowa05SB02  distance        0.154  0.306  0.207  0.307  0.000  0.000  0.000  0.000
UIowa05SB03  product         0.274  0.256  0.355  0.261  0.039  0.160  0.275  0.653
UIowa05SB04  hsb             0.055  0.232  0.010  0.256  0.185  0.228  0.548  0.717
UIowa05SB05  distance & hsb  0.192  0.318  0.206  0.299  0.152  0.418  0.573  0.790
UIowa05SB06  product & hsb   0.289  0.273  0.348  0.265  0.117  0.369  0.460  0.841

(F-Rec and F-Prec denote frame recall and frame precision for gradual transitions.)

Table 2: Shot Boundary Retrospective

Year  Method     All           Cuts          Gradual
                 Rec    Prec   Rec    Prec   Rec    Prec   F-Rec  F-Prec
2003  histogram  0.445  0.804  0.554  0.937  0.178  0.389  0.234  0.96
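For illustration of the sliding-window, pixel-distance idea behind the UIowa05LF runs summarized earlier, the sketch below (again a hedged approximation, not our actual system) estimates a per-frame-pair shift by searching a small window of pixel offsets for the minimum mean pixel distance, and declares a pan when the horizontal shift keeps the same direction for a run length of consecutive frames. The grayscale input, the exhaustive offset search, and the one-direction run test are assumptions for illustration; the real runs also involve zoom logic and telltale locations, which are omitted here.

    # Hedged sketch of sliding-window, pixel-distance motion detection;
    # the offset search, run-length test and defaults are illustrative assumptions.
    import numpy as np

    def best_shift(prev, curr, window=5):
        """Offset (dx, dy) within +/-window pixels minimizing mean pixel distance."""
        h, w = prev.shape
        best, best_d = (0, 0), np.inf
        for dy in range(-window, window + 1):
            for dx in range(-window, window + 1):
                a = prev[max(0, dy):h + min(0, dy), max(0, dx):w + min(0, dx)]
                b = curr[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
                d = np.abs(a.astype(int) - b.astype(int)).mean()
                if d < best_d:
                    best, best_d = (dx, dy), d
        return best

    def detect_pans(gray_frames, window=5, run_length=5):
        """Declare a pan when the horizontal shift keeps one sign for run_length frames."""
        shifts = [best_shift(gray_frames[i - 1], gray_frames[i], window)
                  for i in range(1, len(gray_frames))]
        pans, run = [], 0
        for i, (dx, _) in enumerate(shifts):
            # count consecutive rightward shifts; a full detector would also
            # track leftward and vertical motion, plus zooms
            run = run + 1 if dx > 0 else 0
            if run == run_length:
                pans.append(i - run_length + 1)   # first frame pair of the run
        return pans

Fast pans or tilts that move by more than the window range between frames would defeat this kind of search, consistent with the window-overrun false negatives noted earlier.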
References

[1] Ramin Zabih et al. A feature-based algorithm for detecting and classifying scene breaks. MULTIMEDIA '95, 1995.
[2] Thomas G. Dietterich. An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization. Machine Learning, 2000.
[3] Dong-Jun Park et al. Experiments in Boundary Recognition at the University of Iowa. TRECVID, 2003.
[4] Dong-Jun Park et al. Boundary and Feature Recognition at the University of Iowa. TRECVID, 2004.
[5] Jonathan G. Fiscus et al. A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER). 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings, 1997.