Conditional Alignment Random Fields for Multiple Motion Sequence Alignment

We consider the multiple time-series alignment problem, focusing in particular on the task of synchronizing multiple motion videos of the same kind of human activity. Finding an optimal global alignment of multiple sequences is computationally infeasible, so several approximate solutions have been proposed, including iterative pairwise warping algorithms and variants of hidden Markov models. In this paper, we propose a novel probabilistic model that represents the conditional densities of the latent target sequences, which are aligned with the given observed sequences through hidden alignment variables. By imposing certain constraints on the target sequences at the learning stage, we obtain a sensible model for multiple alignments that can be learned efficiently with the EM algorithm. Compared to existing methods, our approach yields more accurate alignments while being more robust to local optima and initial configurations. We demonstrate its efficacy on both synthetic and real-world motion videos, including facial emotions and human activities.
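For readers unfamiliar with the pairwise warping baselines mentioned above, the following is a minimal sketch of aligning two one-dimensional sequences, assuming the pairwise warp is computed by classic dynamic time warping. It is not the model proposed in this paper; the function name `dtw`, the absolute-difference cost, and the toy sequences are illustrative assumptions.

```python
from math import inf


def dtw(x, y):
    """Align 1-D sequences x and y by dynamic time warping.

    Returns (total_cost, path), where path is a list of index pairs (i, j)
    meaning x[i] is matched with y[j]. Illustrative helper, not from the paper.
    """
    n, m = len(x), len(y)
    # D[i][j] = minimal cumulative cost of aligning x[:i] with y[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])  # assumed local cost
            D[i][j] = d + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])

    # Backtrack from the end, always stepping to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (D[i - 1][j - 1], i - 1, j - 1),
            (D[i - 1][j], i - 1, j),
            (D[i][j - 1], i, j - 1),
        )
    path.reverse()
    return D[n][m], path


# Toy example: y repeats the value 1, so x[1] is matched to two frames of y.
cost, path = dtw([0, 1, 2, 3], [0, 1, 1, 2, 3])
```

A multiple-sequence procedure built on this would repeatedly warp each sequence to a reference and update the reference, which is one reason such schemes can depend on the initial configuration.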
