Discrete representation of perceptual structure underlying consonant confusions.

The perceptual representation of speech is generally assumed to be discrete rather than continuous, pointing to the need for general discrete analytic models to represent observed perceptual similarities among speech sounds. The INDCLUS (INdividual Differences CLUStering) model and algorithm [J. D. Carroll and P. Arabie, Psychometrika 48, 157-169 (1983)] can provide this generality, representing symmetric three-way similarity data (stimuli × stimuli × conditions) as an additive combination of overlapping, and generally nonhierarchical, clusters whose weights (numerical values gauging the importance of the clusters) vary as a function of both the cluster and the condition being considered. INDCLUS was used to obtain a discrete representation of the perceptual structure underlying the Miller and Nicely consonant confusion data [G. A. Miller and P. E. Nicely, J. Acoust. Soc. Am. 27, 338-352 (1955)]. A 14-cluster solution accounted for 82.9% of the total variance across the 17 listening conditions. The cluster composition and the variation in cluster weights as a function of stimulus degradation were interpreted in terms of the common and unique perceptual attributes of the consonants within each cluster. Low-pass filtering and noise masking selectively degraded unique attributes, especially the cues for place of articulation, whereas high-pass filtering degraded both unique and common attributes. The clustering results revealed that perceptual similarities among consonants are accurately modeled by additive combinations of specific, discrete acoustic attributes whose weights are determined by the nature of the stimulus degradation.
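
For reference, the additive clustering decomposition described above can be sketched as follows (notation adapted from Carroll and Arabie; the particular symbol choices here are illustrative rather than those of the present study):

\[
s_{ij}^{(k)} \;\approx\; \sum_{r=1}^{R} w_{kr}\, p_{ir}\, p_{jr} \;+\; c_k,
\qquad p_{ir} \in \{0,1\},\; w_{kr} \ge 0,
\]

where \(s_{ij}^{(k)}\) is the similarity between consonants \(i\) and \(j\) under listening condition \(k\), \(p_{ir}\) indicates whether consonant \(i\) belongs to cluster \(r\), \(w_{kr}\) is the weight of cluster \(r\) in condition \(k\), and \(c_k\) is an additive constant for that condition. The cluster memberships \(p_{ir}\) are shared across all conditions, while the weights \(w_{kr}\) vary with the listening condition, which is how the model captures condition-specific changes in the salience of a fixed set of discrete perceptual attributes.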