Identifying the distinctive acoustic cues of Mandarin tones

Using mathematical modeling, this study characterizes the distinctive acoustic features of Mandarin tones based on a corpus of 1013 monosyllabic words produced by 21 native Mandarin speakers. For each tone, 22 acoustic cues were extracted. Besides standard F0, duration, and intensity measures, further cues were determined by fitting two mathematical models to the pitch contours. The first is a broken-line model, which represents the contour as a continuous curve consisting of two line segments joined at a single breakpoint. The second is a parabola, which yields three parameters: a mean F0, an F0 slope, and an F0 curvature. Using Cohen’s d, we identify which of the 22 cues are important for distinguishing each tone from the others across all speakers, and which cues are used idiosyncratically by particular speakers. Although the specific cues that best characterize each tone differ, we show that the three cues obtained by fitting a parabola to the tone contour form an effective small set: any pair of tones is well distinguished by at least one of them. We propose using these three cues as a canonical choice for defining tone characteristics.
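
As an illustration of the parabola cues and the Cohen’s d comparison described above, the following is a minimal sketch and not the study's own implementation. It assumes a pitch contour sampled at equally spaced points, time normalized to the interval [-0.5, 0.5], curvature taken as the second derivative of the fitted parabola, and Cohen’s d computed with a pooled standard deviation. The function names `parabola_cues` and `cohens_d` are hypothetical.

```python
from statistics import fmean, variance


def parabola_cues(f0):
    """Least-squares fit of F0(t) = a*t^2 + b*t + c on t in [-0.5, 0.5].

    Assumes f0 holds at least three equally spaced F0 samples (e.g. in Hz).
    Returns mean F0, F0 slope, and F0 curvature of the fitted parabola.
    """
    n = len(f0)
    if n < 3:
        raise ValueError("need at least three F0 samples")
    # Normalized, symmetric time grid: odd moments of t vanish,
    # so the normal equations decouple into b alone and an (a, c) pair.
    t = [-0.5 + i / (n - 1) for i in range(n)]
    s2 = sum(x ** 2 for x in t)
    s4 = sum(x ** 4 for x in t)
    sy = sum(f0)
    sty = sum(x * y for x, y in zip(t, f0))
    st2y = sum(x * x * y for x, y in zip(t, f0))
    det = n * s4 - s2 ** 2
    a = (n * st2y - s2 * sy) / det
    b = sty / s2
    c = (s4 * sy - s2 * st2y) / det
    return {
        "mean": c + a / 12.0,   # average of the parabola over [-0.5, 0.5]
        "slope": b,             # average slope over the normalized interval
        "curvature": 2.0 * a,   # second derivative (assumed convention)
    }


def cohens_d(x, y):
    """Cohen's d between two samples, using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (fmean(x) - fmean(y)) / pooled_var ** 0.5
```

Under these assumptions, one would compute `parabola_cues` for every token, collect one cue (say, the slope) for two tones, and pass the two lists to `cohens_d`; a large absolute value indicates that the cue separates that pair of tones well.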