Perceptually-motivated assessment of automatically detected lexical stress in L2 learners' speech

This paper presents a method for automatic lexical stress assessment of L2 English speech. Syllable stress can be labeled at three levels: primary (P), secondary (S) and no (N) stress. Secondary stress, however, may vary among word pronunciations within and across accents, and can be difficult for human listeners to perceive. Hence, evaluating lexical stress on all three levels (i.e., the P-S-N criterion, which requires every syllable in a word to be correctly classified in terms of stress) may be too strict. We therefore consider relaxing it to either the P-N or the A-P-N criterion: the former requires only the correct placement of primary stress, while the latter relaxes further to allow for confusion between primary and secondary stress. An automatic syllable stress detector is applied to L2 learners' speech, and its output for all the syllables in a word is evaluated under the P-S-N, P-N or A-P-N criterion. Comparisons between automatic and manual assessments of lexical stress patterns suggest that the A-P-N criterion can strike a good balance between accommodating variability and screening out problematic patterns, giving an average word accuracy of 79.6%.
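
To make the three criteria concrete, the following is a minimal sketch of how a detected stress pattern might be checked against a reference pattern for one word. It is one possible reading of the criteria as described above, not the paper's implementation: the function names are hypothetical, and the A-P-N rule in particular is an assumption (it accepts any word in which no syllable is confused between P and N, so that S/N and P/S confusions are both tolerated).

```python
from typing import Sequence


def _check_lengths(ref: Sequence[str], hyp: Sequence[str]) -> None:
    # Both patterns must describe the same syllables of the same word.
    if len(ref) != len(hyp):
        raise ValueError("reference and detected patterns differ in length")


def accept_psn(ref: Sequence[str], hyp: Sequence[str]) -> bool:
    """P-S-N: every syllable must carry exactly the reference label."""
    _check_lengths(ref, hyp)
    return all(r == h for r, h in zip(ref, hyp))


def accept_pn(ref: Sequence[str], hyp: Sequence[str]) -> bool:
    """P-N: only the placement of primary stress must match (S treated as N)."""
    _check_lengths(ref, hyp)
    return all((r == "P") == (h == "P") for r, h in zip(ref, hyp))


def accept_apn(ref: Sequence[str], hyp: Sequence[str]) -> bool:
    """A-P-N (assumed reading): reject only syllables confused between P and N."""
    _check_lengths(ref, hyp)
    return all({r, h} != {"P", "N"} for r, h in zip(ref, hyp))


# Hypothetical four-syllable word with reference pattern S N P N.
reference = ["S", "N", "P", "N"]
detected_a = ["N", "N", "P", "N"]  # secondary stress missed
detected_b = ["P", "N", "S", "N"]  # primary and secondary swapped
detected_c = ["N", "P", "N", "N"]  # primary stress on the wrong syllable

for hyp in (detected_a, detected_b, detected_c):
    print(hyp, accept_psn(reference, hyp), accept_pn(reference, hyp),
          accept_apn(reference, hyp))
```

Under these assumptions, the first detected pattern fails only P-S-N, the second is accepted only by A-P-N, and the third is rejected by all three criteria, which illustrates how each relaxation admits more variability while still screening out misplaced primary stress.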