A cooperative top-down/bottom-up technique for motion field segmentation

The segmentation of video sequences into regions undergoing coherent motion is one of the most useful processing steps in video analysis and coding. In this paper, we propose an algorithm that exploits the advantages of both top-down and bottom-up techniques for motion field segmentation. To remove camera motion, global motion estimation and compensation are first performed. Local motion estimation is then carried out using a translational motion model. Starting from this motion field, a two-stage analysis based on affine models takes place. In the first stage, macro-regions with coherent affine motion are extracted using a top-down segmentation technique. In the second stage, the segmentation of each macro-region is refined using a bottom-up approach based on motion vector clustering. To further improve the accuracy of the spatio-temporal segmentation, a Markov Random Field (MRF)-inspired refinement step based on both motion and intensity is performed to adjust object boundaries.
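
To illustrate the affine modelling on which the two-stage analysis relies, the following minimal sketch fits a six-parameter affine model to a set of translational motion vectors and measures how well each vector agrees with it. The function names, the parameter ordering, and the use of ordinary least squares are assumptions made here for illustration; the abstract does not specify the estimator actually used.

```python
import numpy as np


def fit_affine_motion(xs, ys, us, vs):
    """Least-squares fit of a six-parameter affine motion model.

    Assumed (hypothetical) parameterisation:
        u(x, y) = a0 + a1 * x + a2 * y
        v(x, y) = a3 + a4 * x + a5 * y
    xs, ys are block positions; us, vs are the local motion vectors.
    Returns the array [a0, a1, a2, a3, a4, a5].
    """
    design = np.column_stack([np.ones_like(xs, dtype=float), xs, ys])
    params_u, *_ = np.linalg.lstsq(design, us, rcond=None)
    params_v, *_ = np.linalg.lstsq(design, vs, rcond=None)
    return np.concatenate([params_u, params_v])


def affine_residual(params, xs, ys, us, vs):
    """Per-vector Euclidean distance between observed and modelled motion."""
    design = np.column_stack([np.ones_like(xs, dtype=float), xs, ys])
    du = design @ params[:3] - us
    dv = design @ params[3:] - vs
    return np.hypot(du, dv)
```

In a top-down scheme of this kind, a region whose residuals remain large under a single affine model would be a candidate for further splitting, while vectors with small residuals can be regarded as belonging to the same coherent motion.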