Separation of several speakers recorded by two microphones (cocktail-party processing)

Abstract. A signal-processing method is described that enhances the directional separation of an ordinary (dummy-head) stereophonic speech recording. After initial adaptation to a chosen direction, it simulates the human ability to concentrate on speech coming from this direction and to suppress disturbing speakers from other directions. The method is derived from the principle of adaptive noise cancelling; the adaptive filter is realized by FFT/overlap-add processing and is determined from short-time power estimates. In tests with up to four speakers, clear improvements in SNR and in the intelligibility of the desired speaker were obtained.
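As an illustration of the general idea only, the following is a minimal sketch of a frequency-domain noise canceller with overlap-add synthesis. It is not the algorithm of this paper: the abstract does not give the update rule, so the sketch assumes a generic per-bin estimate W = P_xr / P_rr formed from recursively smoothed short-time power estimates. It also assumes that a reference signal containing only the disturbing speakers is already available, which in practice would have to come from the initial adaptation to the desired direction. The function name `anc_overlap_add` and the parameters `frame`, `alpha` and `eps` are hypothetical.

```python
import numpy as np


def anc_overlap_add(primary, reference, frame=256, alpha=0.9, eps=1e-8):
    """Hypothetical sketch: per-bin noise cancelling with overlap-add.

    primary   -- desired speech plus interference (1-D array)
    reference -- interference only (assumption: desired speaker absent)
    alpha     -- smoothing constant of the short-time power estimates
    """
    primary = np.asarray(primary, dtype=float)
    reference = np.asarray(reference, dtype=float)
    n = min(len(primary), len(reference))
    hop = frame // 2  # 50 % frame overlap

    # Square-root periodic Hann window, applied at analysis and synthesis;
    # the product is a Hann window, which sums to one at 50 % overlap.
    window = np.sqrt(np.hanning(frame + 1)[:frame])

    bins = frame // 2 + 1
    p_rr = np.full(bins, eps)             # short-time power of the reference
    p_xr = np.zeros(bins, dtype=complex)  # short-time cross power
    out = np.zeros(n)

    for start in range(0, n - frame + 1, hop):
        x = np.fft.rfft(window * primary[start:start + frame])
        r = np.fft.rfft(window * reference[start:start + frame])

        # Recursive (exponentially weighted) short-time estimates.
        p_rr = alpha * p_rr + (1.0 - alpha) * np.abs(r) ** 2
        p_xr = alpha * p_xr + (1.0 - alpha) * x * np.conj(r)

        # Per-bin filter mapping the reference onto the interference
        # contained in the primary channel, then subtraction.
        w = p_xr / (p_rr + eps)
        y = x - w * r

        # Overlap-add of the windowed inverse transform.
        out[start:start + frame] += window * np.fft.irfft(y, frame)

    return out
```

The per-bin multiplication corresponds to a circular rather than a linear convolution, so the sketch is only adequate when the coupling between reference and primary is short compared with the frame length. A smaller `alpha` tracks changes faster but also cancels more of the desired speech, because each frame then contributes more strongly to its own filter estimate.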