Human Attention Detection Using AM-FM Representations

Human activity detection from digital videos presents many challenges to the computer vision and image processing communities. Recently, many methods have been developed to detect human activities with varying degrees of success. Yet the general human activity detection problem remains very challenging, especially when the methods need to work “in the wild” (e.g., without precise control over the imaging geometry). The thesis explores phase-based solutions for (i) detecting faces, (ii) detecting the backs of heads, (iii) jointly detecting faces and the backs of heads, and (iv) determining whether the head is looking to the left or the right, using standard video cameras without any control over the imaging geometry. The proposed phase-based approach develops simple and robust methods that rely on Amplitude-Modulation Frequency-Modulation (AM-FM) models. The approach is validated using video frames extracted from the Advancing Out-of-school Learning in Mathematics and Engineering (AOLME) project. The dataset consisted of 13,265 images from ten students looking at the camera, and 6,122 images
