Estimation of the Ideal Binary Mask using Directional Systems

The ideal binary mask is often seen as a goal for time-frequency masking algorithms that try to increase speech intelligibility, but it requires access to the unmixed signals, which makes it difficult to calculate in real-life applications. In this paper we derive the theory and the requirements that enable calculation of the ideal binary mask using a directional system, without the unmixed signals being available. The proposed method has low complexity and is verified using computer simulations in both ideal and non-ideal setups, showing promising results.

Index Terms: Time-Frequency Masking, Directional systems, Ideal Binary Mask, Speech Intelligibility, Sound separation
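To make the dependence on the unmixed signals concrete, the following is a minimal sketch of the commonly used definition of the ideal binary mask: a time-frequency unit is set to 1 when the target energy exceeds the interferer energy by a local criterion, and to 0 otherwise. It is not the directional method proposed in the paper. The function name `ideal_binary_mask`, the parameter `lc_db`, and the toy energy values are assumptions made for illustration only.

```python
def ideal_binary_mask(target_energy, interferer_energy, lc_db=0.0):
    """Return a 0/1 mask with one entry per time-frequency unit.

    target_energy and interferer_energy are equally shaped nested lists
    (frequency x time) of non-negative energies taken from the UNMIXED
    signals. lc_db is the local criterion in dB (hypothetical name).
    """
    threshold = 10.0 ** (lc_db / 10.0)  # convert dB to a linear energy ratio
    mask = []
    for t_row, i_row in zip(target_energy, interferer_energy):
        # Keep a unit only when the target exceeds the scaled interferer energy.
        mask.append([1 if t > threshold * i else 0 for t, i in zip(t_row, i_row)])
    return mask


# Toy energies (2 frequency bands x 3 time frames), invented for illustration.
target = [[4.0, 0.5, 2.0], [0.1, 3.0, 1.0]]
interferer = [[1.0, 2.0, 2.0], [0.4, 0.3, 1.0]]

mask = ideal_binary_mask(target, interferer)
print(mask)  # [[1, 0, 0], [0, 1, 0]]
```

Both inputs come from the separated target and interferer, which is exactly the information that is unavailable in a real recording. That is the gap the directional approach described above aims to close.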
