Modern physiological analysis increasingly draws on multiple types of information. Electroencephalogram (EEG) signals, a typical example, are beginning to be analyzed together with facial expression videos to detect emotions. Emotions play an important role in daily human life, and the need for automatic emotion recognition has grown with the increasing role of human-computer interface applications. In this paper, we concentrate on recognizing emotions jointly from "inner" and "outer" reactions, namely EEG signals and facial expression video. Because the problem is streaming in nature, the volume and velocity of the data are very challenging. We address these challenges from a theoretical perspective and propose a real-time algorithm that jointly learns a feature vector from EEG signals and synchronized facial video. Our algorithm consists of an EEG dictionary component, learned without supervision and based on deep learning theory, and a probability pooling component that transforms a continuous sequential signal into an EEG "sentence" made up of a sequence of EEG words. The EEG sentence is then learned jointly with video features to produce a new fixed-length feature representation for emotion classification. We overcome several computational challenges posed by the data using the ideas of convolution and pooling, and we conduct an extensive evaluation of each component of our model. We also demonstrate state-of-the-art classification results on a real-world dataset. The superior performance on the emotion recognition task indicates that 1) a natural-language formulation can be applied to EEG sequences and 2) incorporating the video modality can increase overall performance.
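To make the EEG "sentence" idea concrete, the following is a minimal sketch of one way such a pipeline could look, not the method of this paper. It assumes a single-channel signal, a fixed random dictionary standing in for the learned one, a softmax over negative squared distances as the word probability, mean pooling over consecutive windows, and a normalized word histogram concatenated with a video feature vector as the fixed-length representation. All function names, parameters, and sizes here are hypothetical.

```python
import numpy as np


def sliding_windows(signal, width, step):
    """Cut a 1-D signal into overlapping windows, one window per row."""
    starts = range(0, len(signal) - width + 1, step)
    return np.stack([signal[s:s + width] for s in starts])


def word_probabilities(windows, dictionary, temperature=1.0):
    """Soft-assign each window to the dictionary atoms ("words").

    Assumption: probabilities come from a softmax over negative squared
    Euclidean distances; the paper's actual model may differ.
    """
    diffs = windows[:, None, :] - dictionary[None, :, :]
    logits = -(diffs ** 2).sum(axis=2) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


def pool_to_sentence(probs, pool):
    """Average probabilities over non-overlapping groups of `pool` windows,
    then keep the most probable word in each group."""
    usable = (len(probs) // pool) * pool  # drop the incomplete last group
    pooled = probs[:usable].reshape(-1, pool, probs.shape[1]).mean(axis=1)
    return pooled.argmax(axis=1)


def joint_feature(sentence, n_words, video_feature):
    """Fixed-length vector: normalized word histogram plus video features."""
    hist = np.bincount(sentence, minlength=n_words).astype(float)
    hist /= max(hist.sum(), 1.0)
    return np.concatenate([hist, video_feature])


# Synthetic stand-ins for real data (hypothetical sizes).
rng = np.random.default_rng(0)
eeg = rng.standard_normal(512)               # one EEG channel
dictionary = rng.standard_normal((8, 32))    # 8 "words", 32 samples each
video_feature = np.zeros(16)                 # placeholder video descriptor

windows = sliding_windows(eeg, width=32, step=16)
probs = word_probabilities(windows, dictionary)
sentence = pool_to_sentence(probs, pool=4)
feature = joint_feature(sentence, n_words=8, video_feature=video_feature)
```

Whatever the length of the input signal, `feature` has length `n_words + len(video_feature)`, which is the property a downstream classifier needs. A histogram discards word order, so a sequence model would be the natural replacement if order matters.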