Integrating synthetic faces with text-to-speech voice synthesis (what we call "talking heads") enables new applications in man-machine interfaces. In the near future, talking heads might serve as useful communicative interface agents. Before they are used extensively, however, several issues concerning their acceptability to users have to be examined. An important one is ensuring that a synthetic voice matches the face it is paired with. This paper studies the coherence that might exist between synthetic voices and faces. Twenty-four subjects rated the coherence of every combination of ten faces and six voices. The main results show that not all face-voice associations are relevant and that some associations are rated better than others according to qualitative criteria.