Action MACH: a spatio-temporal Maximum Average Correlation Height filter for action recognition

In this paper we introduce Action MACH, a template-based method for recognizing human actions that is built on the maximum average correlation height (MACH) filter. A common limitation of template-based methods is their inability to generate a single template from a collection of examples. MACH captures intra-class variability by synthesizing a single Action MACH filter for a given action class. We generalize the traditional MACH filter to video (a 3D spatiotemporal volume) and to vector-valued data. By analyzing the response of the filter in the frequency domain, we avoid the high computational cost commonly incurred in template-based approaches. Vector-valued data is analyzed using the Clifford Fourier transform, a generalization of the Fourier transform intended for both scalar and vector-valued data. Finally, we perform an extensive set of experiments and compare our method with some of the most recent approaches in the field on publicly available datasets and on two new annotated human action datasets, which include actions performed in classic feature films and sports broadcast television.
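The idea of synthesizing one filter from many examples and evaluating it in the frequency domain can be illustrated with a minimal sketch. It assumes the standard MACH formulation from the correlation-filter literature, h = (alpha*C + beta*D_x + gamma*S_x)^(-1) m_x, with a white-noise model (C is the identity) and scalar-valued video volumes that all share one shape. The function names, the parameter defaults, and the white-noise choice are illustrative assumptions rather than details taken from the abstract, and the Clifford Fourier transform for vector-valued data is not covered here.

```python
import numpy as np


def synthesize_mach_filter(volumes, alpha=1e-3, beta=1.0, gamma=1.0):
    """Build one frequency-domain filter from N example volumes.

    volumes: array of shape (N, T, H, W), scalar-valued clips of equal size.
    alpha, beta, gamma: trade-off weights (hypothetical defaults).
    """
    volumes = np.asarray(volumes, dtype=np.float64)
    # 3D FFT of every training volume (axis 0 indexes the examples).
    spectra = np.fft.fftn(volumes, axes=(1, 2, 3))
    # m_x: mean spectrum of the training examples.
    mean_spectrum = spectra.mean(axis=0)
    # D_x: average power spectral density (diagonal, stored elementwise).
    avg_power = (np.abs(spectra) ** 2).mean(axis=0)
    # S_x: average similarity, i.e. spread of the examples around the mean.
    avg_similarity = (np.abs(spectra - mean_spectrum) ** 2).mean(axis=0)
    # C: noise covariance, assumed white (identity) in this sketch.
    noise = np.ones_like(avg_power)
    denominator = alpha * noise + beta * avg_power + gamma * avg_similarity
    return mean_spectrum / denominator


def correlate(filter_spectrum, video):
    """Circular cross-correlation of a video with the filter via the FFT.

    The video must have the same (T, H, W) shape as the filter.
    """
    video = np.asarray(video, dtype=np.float64)
    if video.shape != filter_spectrum.shape:
        raise ValueError("video and filter must have the same shape")
    video_spectrum = np.fft.fftn(video)
    # Correlation theorem: multiply by the complex conjugate of the filter.
    response = np.fft.ifftn(video_spectrum * np.conj(filter_spectrum))
    return response.real


def peak_location(response):
    """Return the (t, y, x) index of the strongest correlation response."""
    return np.unravel_index(np.argmax(response), response.shape)
```

A single pair of FFTs and one elementwise product replaces sliding the template over every spatiotemporal offset, which is where the computational saving comes from. In practice the peak height would be compared against a threshold chosen per action class; that step is omitted in this sketch.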
