Neural-net classification for spatio-temporal descriptor based depression analysis

Depression is a severe psychiatric disorder. Despite its high prevalence, current clinical practice depends almost exclusively on self-report and clinical opinion, which risks a range of subjective biases. This paper focuses on depression analysis based on visual cues from facial expressions and upper body movements. The proposed diagnostic support system computes spatio-temporal features from video sequences. Space-Time Interest Points are computed from the videos to analyse upper body movements, and a temporal visual-words dictionary is learned from them. Intra-facial muscle movement is captured by computing an LBP-TOP-based codebook. Various neural-net classifiers are explored and compared with an SVM. The approach is evaluated on real-world clinical data from interactive interviews with depressed and healthy subjects.
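To make the visual-words step concrete, the following is a minimal sketch of how a codebook could be learned from local descriptors and then used to encode one video as a normalised histogram. It is an illustration only: the abstract does not say which clustering method is used, so plain k-means is an assumption here, and the function names (`learn_codebook`, `encode_video`), the evenly spaced initialisation, and the toy 2-D descriptors are all hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(descriptor, codebook[k]))


def learn_codebook(descriptors, num_words, iterations=20):
    """Learn `num_words` cluster centres with plain k-means (assumed method).

    Centres are initialised from evenly spaced descriptors, a deterministic
    simplification chosen for this sketch; random or k-means++ seeding is
    more common in practice.
    """
    n = len(descriptors)
    codebook = [list(descriptors[i * n // num_words]) for i in range(num_words)]
    for _ in range(iterations):
        # Assignment step: group descriptors by their nearest centre.
        clusters = [[] for _ in range(num_words)]
        for d in descriptors:
            clusters[nearest_word(d, codebook)].append(d)
        # Update step: move each centre to the mean of its members.
        for k, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centre
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def encode_video(descriptors, codebook):
    """Histogram of visual-word occurrences, normalised to sum to 1."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest_word(d, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy training descriptors forming two well-separated groups.
training = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
            [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
codebook = learn_codebook(training, num_words=2)

# Descriptors from one hypothetical video: one near the first word,
# three near the second.
video = [[0.0, 0.0], [5.0, 5.0], [5.0, 5.1], [5.1, 5.0]]
histogram = encode_video(video, codebook)
```

In a pipeline of the kind the abstract describes, one such fixed-length histogram per video would be the input to the classifiers being compared.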
