Mathematical evidence of the acoustic universal structure in speech

The paper shows mathematically that an acoustic universal structure exists in speech, which can be interpreted as a physical implementation of structural phonology. The structure contains no dimensions at all that carry multiplicative or linear transformational distortions, which are inevitably introduced in speech communication by differences in vocal tract shape, gender, age, microphone, room, line, hearing characteristics, and so on. A speech event, such as a phone, is modeled probabilistically as a distribution of parameters calculated by a linear transformation of a log spectrum, e.g., the cepstrum. A set of events, such as a word, is captured in relative terms as a structure composed of these distributions. An n-point structure is uniquely determined by fixing the lengths of its ${}_{n}C_{2}$ diagonal lines, namely, the distance matrix among the n points. The distance between two distributions is calculated as the Bhattacharyya distance. The resulting structure has a notable property: multiplicative and linear transformational distortions are interpreted geometrically as a shift and a rotation of the structure, respectively. This fact implies that there always exists a distortion-free communication channel between a speaker and a listener.
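
The invariance described above can be checked numerically with a minimal sketch. It rests on assumptions that are not stated in the abstract: each speech event is taken to be a multivariate Gaussian over cepstral parameters, so the Bhattacharyya distance has a closed form, and a distortion is modeled as an invertible affine map c' = Ac + b on the cepstrum (b for the multiplicative distortion, A for the linear transformational one). All names, dimensions, and the random data are hypothetical and chosen for illustration only.

```python
import numpy as np


def bhattacharyya_distance(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two Gaussians (assumed event model)."""
    cov = 0.5 * (cov1 + cov2)
    diff = mu1 - mu2
    # Mean term: (1/8) * diff^T cov^{-1} diff
    term_mean = 0.125 * diff @ np.linalg.solve(cov, diff)
    # Covariance term: (1/2) * ln( det(cov) / sqrt(det(cov1) * det(cov2)) )
    _, logdet = np.linalg.slogdet(cov)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    term_cov = 0.5 * (logdet - 0.5 * (logdet1 + logdet2))
    return term_mean + term_cov


def distance_matrix(means, covs):
    """Symmetric n x n matrix holding the nC2 pairwise distances."""
    n = len(means)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = bhattacharyya_distance(
                means[i], covs[i], means[j], covs[j]
            )
    return d


def random_spd(rng, dim):
    """Random symmetric positive-definite matrix (hypothetical test data)."""
    m = rng.standard_normal((dim, dim))
    return m @ m.T + dim * np.eye(dim)


rng = np.random.default_rng(0)
dim, n = 4, 5  # hypothetical cepstral dimension and number of events

# Hypothetical "speech events": n Gaussians in cepstral space.
means = [rng.standard_normal(dim) for _ in range(n)]
covs = [random_spd(rng, dim) for _ in range(n)]

# Assumed distortion: c' = A c + b, with A invertible by construction
# (a positive-definite matrix times an orthogonal matrix).
q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
A = random_spd(rng, dim) @ q
b = rng.standard_normal(dim)

means_t = [A @ m + b for m in means]
covs_t = [A @ c @ A.T for c in covs]

D = distance_matrix(means, covs)
D_t = distance_matrix(means_t, covs_t)

# The distance matrix, and hence the structure, should be unchanged.
print(np.allclose(D, D_t))
```

Under these assumptions every distribution changes, yet the distance matrix that defines the structure agrees before and after the distortion up to floating-point error. The sketch verifies only this invariance; it does not reproduce the paper's derivation or its geometric interpretation as shift and rotation.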