Non-parametric estimation and correction of non-linear distortion in speech systems
The performance of speech systems such as speaker recognition degrades drastically when non-linear distortion causes a mismatch between training and testing conditions. This paper describes a technique to estimate and correct such non-linear distortion in speech. The focus is on constrained restoration of degraded speech; that is, distortion in the test speech is undone relative to the training speech. Restoration is a two-step process: estimation followed by inversion. The non-linearity is estimated in the form of a look-up table by a process of statistical matching using a reference speech template. This statistical matching technique provides a very good estimate of the true non-linear characteristic, and the process is robust, computationally efficient, and universally applicable. Speaker-ID experiments, using artificially corrupted test speech, showed significant improvement in performance after the test speech was 'cleaned' using this technique. The restoration process itself does not introduce appreciable distortion.
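The abstract does not spell out the matching procedure, so the following is only a minimal sketch of one plausible reading: quantile (histogram) matching between a reference signal and the distorted signal, which yields an inverse look-up table directly. It assumes the distortion is memoryless and monotonic and that clean test speech has the same amplitude distribution as the reference; the function names (`build_inverse_lut`, `apply_lut`, `demo`), the table size, the `tanh` distortion, and the Gaussian stand-in for speech samples are all illustrative assumptions, not details from the paper.

```python
import bisect
import math
import random


def build_inverse_lut(reference, distorted, n_points=256):
    """Estimate an inverse look-up table by quantile matching.

    The q-th quantile of the distorted signal is paired with the q-th
    quantile of the reference signal. Assumes (hypothetically) a memoryless,
    monotonically increasing distortion.
    """
    ref = sorted(reference)
    dis = sorted(distorted)
    levels, restored = [], []
    for k in range(n_points):
        q = k / (n_points - 1)
        d = dis[round(q * (len(dis) - 1))]
        r = ref[round(q * (len(ref) - 1))]
        if not levels or d > levels[-1]:  # keep input levels strictly increasing
            levels.append(d)
            restored.append(r)
    return levels, restored


def apply_lut(levels, restored, y):
    """Restore one sample by linear interpolation in the table, clamped at the ends."""
    if y <= levels[0]:
        return restored[0]
    if y >= levels[-1]:
        return restored[-1]
    i = bisect.bisect_right(levels, y)
    x0, x1 = levels[i - 1], levels[i]
    t = (y - x0) / (x1 - x0)
    return restored[i - 1] + t * (restored[i] - restored[i - 1])


def demo(seed=0, n=20000):
    """Return the mean absolute error before and after restoration on synthetic data."""
    rng = random.Random(seed)
    # Synthetic stand-ins for speech samples (an assumption, not real speech).
    reference = [rng.gauss(0.0, 0.3) for _ in range(n)]
    clean_test = [rng.gauss(0.0, 0.3) for _ in range(n)]
    # Artificial corruption with an assumed soft-clipping non-linearity.
    distorted = [math.tanh(3.0 * x) for x in clean_test]

    levels, restored = build_inverse_lut(reference, distorted)
    cleaned = [apply_lut(levels, restored, y) for y in distorted]

    err_before = sum(abs(y - x) for y, x in zip(distorted, clean_test)) / n
    err_after = sum(abs(c - x) for c, x in zip(cleaned, clean_test)) / n
    return err_before, err_after


if __name__ == "__main__":
    print(demo())
```

Because the table is built from sorted samples, both columns are non-decreasing, so estimation and inversion collapse into a single table look-up in this sketch. A real system would estimate the table from speech frames rather than from synthetic Gaussian samples.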