Separating transparent layers of repetitive dynamic behaviors

In this paper, we present an approach for separating two transparent layers of complex nonrigid scene dynamics. The dynamics in one of the layers are assumed to be repetitive, while the other layer can have arbitrary dynamics. Such repetitive dynamics include, among others, human actions in video (e.g., a walking person) and repetitive musical tunes in audio signals. We use a global-to-local space-time alignment approach to detect and align the repetitive behavior. Once aligned, a median operator applied to space-time derivatives is used to recover the intrinsic repeating behavior and separate it from the other transparent layer. We show results on synthetic and real video sequences. In addition, we show the applicability of our approach to separating mixed audio signals (from a single source).
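The median-of-derivatives step can be illustrated with a minimal 1D sketch. This is not the paper's implementation. It assumes the repetitions are already aligned and have a known, fixed period (the alignment stage is skipped), and it uses only a temporal derivative on a synthetic signal. The function name `recover_repetitive_layer` and all parameters are hypothetical.

```python
import numpy as np

def recover_repetitive_layer(mixed, period, n_reps):
    """Estimate one cycle of the repetitive layer from a mixed 1D signal.

    Assumption: repetitions are already aligned, each `period` samples long.
    The result is recovered only up to an additive constant.
    """
    # Stack the aligned repetitions as rows: shape (n_reps, period).
    segments = mixed[: period * n_reps].reshape(n_reps, period)
    # Temporal derivative within each repetition.
    derivs = np.diff(segments, axis=1)
    # The median across repetitions rejects derivatives contributed by the
    # non-repetitive layer, as long as they affect a minority of repetitions.
    median_deriv = np.median(derivs, axis=0)
    # Integrate the median derivative back into a signal (starts at 0).
    return np.concatenate([[0.0], np.cumsum(median_deriv)])

# Synthetic example: a sinusoidal repetitive layer plus a one-off bump.
period, n_reps = 50, 5
repetitive = np.sin(2 * np.pi * np.arange(period) / period)
other = np.zeros(period * n_reps)
other[60:80] = 1.0  # arbitrary layer, present in one repetition only
mixed = np.tile(repetitive, n_reps) + other

recovered = recover_repetitive_layer(mixed, period, n_reps)
residual = mixed - np.tile(recovered + repetitive[0], n_reps)
```

Here `residual` is the estimate of the other layer. It uses the true offset `repetitive[0]` only because the synthetic ground truth is known. In general the constant of integration is not determined by the derivatives alone.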
