Human Behavior Analysis from Video by Studying Motion Orientation

Recognizing human behavior and predicting people's activities from video are major concerns in computer vision. The main objective of my thesis is to propose algorithms that analyze moving objects in video in order to extract human behaviors. The analysis is carried out in indoor or outdoor environments filmed with simple webcams or with more sophisticated cameras. The analyzed scene falls into one of two types, depending on the number of people present. The first type is the crowd scene, in which the number of people is large. In this type of scene, we address crowd event detection, flow analysis, and motion pattern extraction. The second type, called an individual scene, is characterized by the presence of a single person at a time in the camera's field of view. There we address human action recognition. To reach these objectives, we propose an approach based on three levels of analysis. The first extracts low-level features from the images that make up a video stream (e.g., the moving regions). The second builds descriptors for human behavior analysis (e.g., the mean direction and mean speed of motion). The highest level uses the descriptors from the intermediate stage to give users concrete results about human behavior (e.g., one person is walking, another is running). Experiments on well-known benchmarks validated our approaches, which compare favorably with the state of the art.
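
As an illustration of the intermediate level, the sketch below computes a mean direction and a mean speed from a set of motion vectors. It is only a minimal example under stated assumptions: the function name `motion_descriptor`, the `(dx, dy)` input format (displacement in pixels per frame), and the use of a circular mean for the direction are choices made here for illustration, not the descriptors defined in the thesis.

```python
import math


def motion_descriptor(vectors):
    """Summarize motion vectors as (mean_direction, mean_speed).

    vectors: iterable of (dx, dy) displacements, assumed here to be
    in pixels per frame (hypothetical input format).
    mean_direction is a circular mean in radians, in [-pi, pi].
    mean_speed is the average vector magnitude.
    """
    vectors = list(vectors)
    if not vectors:
        raise ValueError("at least one motion vector is required")

    sum_cos = 0.0
    sum_sin = 0.0
    total_speed = 0.0
    for dx, dy in vectors:
        speed = math.hypot(dx, dy)
        total_speed += speed
        if speed > 0:
            # Accumulate unit vectors so that every moving point
            # contributes equally to the direction, whatever its speed.
            sum_cos += dx / speed
            sum_sin += dy / speed

    mean_speed = total_speed / len(vectors)
    # atan2 of the summed unit vectors handles angle wrap-around,
    # unlike a plain arithmetic mean of angles.
    mean_direction = math.atan2(sum_sin, sum_cos)
    return mean_direction, mean_speed


if __name__ == "__main__":
    direction, speed = motion_descriptor([(1.0, 0.0), (0.0, 1.0)])
    print(f"direction = {math.degrees(direction):.1f} deg, speed = {speed:.2f}")
```

A circular mean is used because directions wrap around: averaging 350 degrees and 10 degrees arithmetically would give 180 degrees, whereas the summed unit vectors point near 0 degrees. A higher-level stage could then compare such descriptors across frames, but the decision rule itself is outside the scope of this sketch.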
