Neural networks for estimating articulatory positions from speech

This talk describes an application of neural networks to estimating the positions of articulators in the vocal tract (such as the tongue and the lips) from the speech signal. In general, a neural network consists of a large number of interconnected computational elements. The networks discussed in this talk have an input layer of nodes connected, either directly or through an intermediate layer of hidden nodes, to an output layer. Iterative gradient search procedures are often used to determine the unknown parameters of a neural network, but these procedures are very slow when the network has a large number of hidden nodes. For estimating articulator positions, it was found that the first-layer weights could be set to random values and held fixed during training without degrading the performance of the network. With the first-layer weights fixed, the unknown parameters of the second layer can be determined by a fast noniterative procedure....
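The idea of fixed random first-layer weights combined with a noniterative fit of the second layer can be illustrated with a minimal sketch. The abstract does not name the noniterative procedure, so the sketch assumes ordinary linear least squares; the tanh hidden units, the linear output layer, the layer sizes, and the synthetic data standing in for acoustic features and articulator positions are likewise assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def hidden_activations(X, W1, b1):
    """Hidden-layer outputs plus a constant column for the output bias."""
    H = np.tanh(X @ W1 + b1)  # tanh nonlinearity is an assumption
    return np.hstack([H, np.ones((H.shape[0], 1))])


def fit_fixed_random_first_layer(X, Y, n_hidden, seed=0):
    """Fit a one-hidden-layer network without iterative training.

    The first-layer weights are drawn at random once and never updated.
    The second-layer weights are then the solution of a linear
    least-squares problem (assumed here; the source only says
    "fast noniterative procedure").
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(size=(X.shape[1], n_hidden))  # fixed random weights
    b1 = rng.normal(size=n_hidden)                # fixed random biases
    H1 = hidden_activations(X, W1, b1)
    # One linear solve replaces the gradient search for the second layer.
    W2, *_ = np.linalg.lstsq(H1, Y, rcond=None)
    return W1, b1, W2


def predict(X, W1, b1, W2):
    return hidden_activations(X, W1, b1) @ W2


# Synthetic stand-in data: 12 "acoustic" features, 3 "articulator" targets.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 12))
Y = np.tanh(X @ data_rng.normal(size=(12, 3)))

n_hidden = 40
W1, b1, W2 = fit_fixed_random_first_layer(X, Y, n_hidden)
Y_hat = predict(X, W1, b1, W2)

mse = np.mean((Y - Y_hat) ** 2)                # training error of the network
baseline = np.mean((Y - Y.mean(axis=0)) ** 2)  # error of predicting the mean
```

Because the hidden-layer outputs do not depend on any trainable parameter, the second layer is linear in its unknowns, which is what allows a single solve in place of an iterative search. Under these assumptions the cost of training grows with the size of one least-squares problem rather than with the number of gradient iterations.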