Affective Pattern Classification

We develop a method for recognizing the emotional state of a person who is deliberately expressing one of eight emotions. Four physiological signals were measured and six features of each signal were extracted. We investigated three methods for recognition: (1) sequential floating forward search (SFFS) feature selection with K-nearest-neighbors classification, (2) Fisher projection on structured subsets of features with MAP classification, and (3) a hybrid SFFS-Fisher projection method. Each method was evaluated on the full set of eight emotions as well as on several subsets. SFFS attained the highest rate on a trio of emotions, 2.7 times that of random guessing (chance for three classes is 33%, so roughly 90% accuracy), while Fisher projection with structured subsets attained the best performance on the full set of eight emotions, 3.9 times random (roughly 49%, where chance is 12.5%). The emotion recognition problem is demonstrated to be a difficult one: day-to-day variations within the same class often exceed between-class variations on the same day. We present a way to take the day information into account, resulting in an improvement to the Fisher-based methods. The findings in this paper demonstrate that there is significant information in physiological signals for classifying the affective state of a person who is deliberately expressing a small set of emotions.

Introduction

This paper addresses emotion recognition, specifically the recognition by computer of affective information expressed by people. It is part of a larger effort in "affective computing," computing that relates to, arises from, or deliberately influences emotions (Picard 1997). Affective computing has numerous applications and motivations, one of which is giving computers the skills involved in so-called "emotional intelligence," such as the ability to recognize a person's emotions. Such skills have been argued to be more important than mathematical and verbal abilities in determining a person's success in life (Goleman 1995). Recognition of emotional information is a key step toward giving computers the ability to interact more naturally and intelligently with people.

The research described here focuses on recognition of emotional states during deliberate emotional expression by an actress. The actress, trained in guided imagery, used the Clynes method of sentic cycles to assist in eliciting the emotional states (Clynes 1977). For example, to elicit the state of "Neutral" (no emotion) she focused on a blank piece of paper or a typewriter; to elicit the state of "Anger" she focused on people who aroused anger in her. This process was adapted for the eight states: Neutral (no emotion) (N), Anger (A), Hate (H), Grief (G), Platonic Love (P), Love (L), Joy (J), and Reverence (R).

The specific states one would want a computer to recognize will depend on the particular application. The eight emotions used in this research are intended to be representative of a broad range, which can be described in terms of the "arousal-valence" space commonly used by psychologists (Lang 1995). The arousal axis ranges from calm and peaceful to active and excited, while the valence axis ranges from negative to positive. For example, anger was considered high in arousal, while reverence was considered low; love was considered positive in valence, while hate was considered negative.

There has been prior work on emotional expression recognition from speech and from image and video; this work, like ours, has focused on deliberately expressed emotions.
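To make method (1) concrete before turning to benchmarks, the sketch below wraps a simplified sequential floating forward search around a K-nearest-neighbors classifier. The synthetic data, the choice of k, the cross-validation setup, and the use of scikit-learn are illustrative assumptions, not the authors' implementation.

# Minimal SFFS + KNN sketch (illustrative; not the paper's implementation).
# Assumes a feature matrix X of shape (n_samples, n_features) and labels y.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def score(X, y, features):
    """Cross-validated KNN accuracy on the given feature subset."""
    clf = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(clf, X[:, features], y, cv=5).mean()

def sffs(X, y, max_features):
    """Sequential floating forward search: greedy forward inclusions,
    each followed by conditional backward ("floating") exclusions."""
    selected, best = [], {}
    while len(selected) < max_features:
        # Forward step: add the single feature that helps most.
        remaining = [f for f in range(X.shape[1]) if f not in selected]
        f_add = max(remaining, key=lambda f: score(X, y, selected + [f]))
        selected.append(f_add)
        best[len(selected)] = (score(X, y, selected), list(selected))
        # Floating step: drop a feature while doing so beats the best
        # score already recorded for the smaller subset size.
        while len(selected) > 2:
            f_drop = max(selected,
                         key=lambda f: score(X, y, [g for g in selected if g != f]))
            candidate = [g for g in selected if g != f_drop]
            s = score(X, y, candidate)
            if s > best[len(candidate)][0]:
                selected = candidate
                best[len(selected)] = (s, list(selected))
            else:
                break
    return max(best.values())  # (accuracy, feature subset)

# Illustrative usage with random data standing in for the 24 features
# (6 features x 4 physiological signals) described in the abstract.
rng = np.random.default_rng(0)
X = rng.normal(size=(160, 24))   # e.g., 8 emotions x 20 sessions each
y = np.repeat(np.arange(8), 20)  # 8 emotion labels
acc, subset = sffs(X, y, max_features=10)
print(f"best accuracy {acc:.2f} with features {subset}")

On random data the accuracy hovers near chance; the point is only to show how the floating backward step lets SFFS undo an earlier greedy inclusion when a smaller subset scores better.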
The few existing benchmarks indicate that the problem is a hard one. In general, people can recognize affect in neutral-content speech with about 60% accuracy when choosing from among roughly six affective states (Scherer 1981). Computer algorithms can match this accuracy, but only under more restrictive assumptions, such as when the sentence content is known. Facial expression recognition is easier, and the rates computers obtain are higher: 80-98% accuracy when recognizing 5-7 classes of emotional expression on groups of 8-32 people (Yacoob & Davis 1996; Essa & Pentland 1997). Facial expressions are easily controlled by people, and easily exaggerated, which facilitates their discrimination. Emotion recognition can also involve other modalities.
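Method (2) from the abstract can likewise be sketched as a Fisher (linear discriminant) projection followed by a MAP decision under Gaussian class-conditional densities. The scikit-learn classes used here, the synthetic data, and the omission of the paper's "structured subsets" of features are all assumptions made for illustration.

# Minimal Fisher-projection + MAP sketch (illustrative assumption;
# the paper's "structured subsets" of features are not reproduced).
import numpy as np
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis)

rng = np.random.default_rng(1)
X = rng.normal(size=(160, 24))   # stand-in for the 24-feature vectors
y = np.repeat(np.arange(8), 20)  # 8 emotion classes

# Fisher projection: project onto at most C - 1 = 7 discriminant axes.
fisher = LinearDiscriminantAnalysis(n_components=7)
Z = fisher.fit_transform(X, y)

# MAP classification: fit one Gaussian per class in the projected space
# and pick the class maximizing posterior ∝ likelihood x prior.
map_clf = QuadraticDiscriminantAnalysis()
map_clf.fit(Z, y)
print("training accuracy:", map_clf.score(Z, y))

Projecting to at most C - 1 dimensions before the MAP step is the standard motivation for Fisher projection: it reduces the number of covariance parameters to estimate, which matters when, as in the abstract, only a few samples per class are available per day.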