A statistical approach to assessing speech and voice variability in speaker verification

Voice and speech parameters of a single speaker vary widely across contexts, particularly when speakers are affected by stress or emotion or when they use speech styles strategically. This high degree of intra-speaker variability presents a major challenge for speaker verification systems. Based on a large-scale study in which different kinds of affective states were induced in over 100 speakers from three language groups, we use a statistical approach to identify speech and voice parameters that are likely to vary strongly as a function of situation and affective state, as well as those that tend to remain relatively stable. In addition, we evaluate the stable parameters with respect to their potential to differentiate individual speakers.
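
As an illustration of the kind of analysis described, the following minimal sketch scores one parameter by the ratio of between-speaker to within-speaker variance (a one-way ANOVA F-ratio). The abstract does not specify the statistical procedure used in the study, so this is an assumed stand-in; the function name `speaker_f_ratio`, the array layout, and the toy numbers are all hypothetical. A high ratio suggests a parameter that stays stable within a speaker across conditions while still separating speakers; a low ratio suggests a parameter dominated by situational variation.

```python
import numpy as np


def speaker_f_ratio(values):
    """Between-speaker / within-speaker mean-square ratio for one parameter.

    `values` is a hypothetical array of shape (n_speakers, n_conditions):
    one measurement of the parameter per speaker and induced condition.
    This is an illustrative one-way ANOVA F-ratio, not the study's method.
    """
    values = np.asarray(values, dtype=float)
    n_speakers, n_conditions = values.shape

    speaker_means = values.mean(axis=1)
    grand_mean = values.mean()

    # Variation of speaker means around the grand mean.
    ms_between = (
        n_conditions * np.sum((speaker_means - grand_mean) ** 2) / (n_speakers - 1)
    )
    # Variation of each speaker's measurements around that speaker's own mean.
    ms_within = np.sum((values - speaker_means[:, None]) ** 2) / (
        n_speakers * (n_conditions - 1)
    )
    return ms_between / ms_within


# Toy data (invented for illustration): 3 speakers x 3 conditions.
stable_param = [[100, 101, 99], [200, 201, 199], [300, 301, 299]]
variable_param = [[100, 150, 50], [102, 152, 52], [98, 148, 48]]

print(speaker_f_ratio(stable_param))    # large: speakers differ, conditions barely matter
print(speaker_f_ratio(variable_param))  # small: conditions dominate, speakers overlap
```

In this toy example the first parameter would be a candidate for speaker verification, while the second would be flagged as strongly dependent on the situation.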