Nonlinear minimum mean square error estimator for mixture-maximisation approximation

Many speech separation, enhancement, and recognition techniques need to express the log spectrum of a mixed speech signal in terms of the log spectra of the underlying speech signals. The mixture-maximisation (MIXMAX) approximation is commonly used for this purpose. A proof of this approximation in a statistical framework is presented. It is concluded that the approximation is a nonlinear minimum mean square error estimator under the assumption that the phases of the underlying speech signals are uniformly distributed.
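The claim can be illustrated numerically with a minimal sketch; this is not the proof given in the paper. It assumes a single frequency bin in which two components with fixed magnitudes add with a phase difference that is uniform over one full cycle, and it uses the fact that a minimum mean square error estimate is a conditional mean. The function names `mixmax_log_spectrum` and `mean_log_mixture_magnitude`, and the grid size `num_phases`, are hypothetical and chosen only for this illustration.

```python
import cmath
import math


def mixmax_log_spectrum(log_x1, log_x2):
    """MIXMAX approximation: element-wise maximum of two log spectra."""
    return [max(a, b) for a, b in zip(log_x1, log_x2)]


def mean_log_mixture_magnitude(mag1, mag2, num_phases=4096):
    """Average log-magnitude of the sum of two components in one bin.

    Assumption for this sketch: the phase difference theta is uniform on
    [0, 2*pi), approximated here by an evenly spaced midpoint grid.
    """
    total = 0.0
    for k in range(num_phases):
        theta = 2.0 * math.pi * (k + 0.5) / num_phases
        # Mixture in this bin: first component plus phase-rotated second one.
        mixture = mag1 + mag2 * cmath.exp(1j * theta)
        total += math.log(abs(mixture))
    return total / num_phases


if __name__ == "__main__":
    averaged = mean_log_mixture_magnitude(2.0, 0.5)
    mixmax = mixmax_log_spectrum([math.log(2.0)], [math.log(0.5)])[0]
    print(averaged, mixmax)
```

Under these assumptions the phase-averaged log magnitude for magnitudes 2.0 and 0.5 comes out very close to `math.log(2.0)`, which is the value the MIXMAX rule selects for that bin.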
