SPEECH ENHANCEMENT OF MULTIPLE MOVING SOURCES BASED ON SUBBAND CLUSTERING TIME-DELAY ESTIMATION

A new robust, blind microphone-array method is presented for enhancing speech signals generated by multiple moving sources in a noisy environment. The approach follows a two-stage scheme. In the first stage, a subband clustering time-delay estimation algorithm localizes the dominant speech sources. In the second stage, a soft-constrained subband beamformer performs the speech enhancement using the acquired spatial information. The robustness of this structure is ensured by a spatial constraint constructed to account for discrepancies in the acoustical environment model as well as errors in the time-delay estimation. This scheme also allows efficient adaptation of the beamformer to speaker movement. The proposed subband clustering approach for time-delay estimation exploits the sparseness of speech signals in the time-frequency domain to localize multiple speakers simultaneously. It also provides a means of selecting the number of target sources. Evaluation in a real environment with moving speakers shows promising results.
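
As an illustration only, the following Python sketch shows one way that sparseness in the time-frequency domain can be exploited to localize several speakers at once. It is not the algorithm evaluated here. It assumes a single microphone pair, estimates one delay per time-frequency point from the cross-spectrum phase, and replaces the clustering step with an energy-weighted histogram whose peaks are taken as the dominant sources. The function names (`stft`, `subband_delay_histogram`, `pick_sources`) and all parameter values are hypothetical.

```python
import numpy as np


def stft(x, frame_len=512, hop=256):
    """Hann-windowed short-time Fourier transform, shape (frames, subbands)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def subband_delay_histogram(x1, x2, fs, max_delay, n_bins=41,
                            frame_len=512, hop=256):
    """Energy-weighted histogram of per-point delay estimates (seconds).

    Assumption: each time-frequency point is dominated by one source,
    so its cross-spectrum phase reflects that source's inter-microphone delay.
    A positive delay means the signal reaches microphone 2 after microphone 1.
    """
    X1 = stft(x1, frame_len, hop)
    X2 = stft(x2, frame_len, hop)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)

    # Keep only subbands whose phase cannot wrap for |delay| <= max_delay.
    usable = (freqs > 0) & (freqs < 1.0 / (2.0 * max_delay))
    cross = X1[:, usable] * np.conj(X2[:, usable])

    # If x2(t) = x1(t - d), the cross-spectrum phase is +2*pi*f*d.
    delays = np.angle(cross) / (2.0 * np.pi * freqs[usable])
    weights = np.abs(cross)

    edges = np.linspace(-max_delay, max_delay, n_bins + 1)
    hist, _ = np.histogram(delays.ravel(), bins=edges,
                           weights=weights.ravel())
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, hist


def pick_sources(centers, hist, rel_threshold=0.3):
    """Return delays at local histogram maxima above a relative threshold.

    The threshold is a hypothetical stand-in for a rule that selects
    the number of target sources.
    """
    padded = np.concatenate(([-np.inf], hist, [-np.inf]))
    is_peak = ((hist > padded[:-2]) & (hist >= padded[2:])
               & (hist >= rel_threshold * hist.max()))
    return centers[is_peak]
```

In this sketch, calling `subband_delay_histogram` on two microphone signals and passing the result to `pick_sources` yields one delay per detected speaker. Re-running it on successive short blocks of data would give a simple way to follow speaker movement, and the resulting delays could then be used to steer a subband beamformer.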