Paper Information - Incorporating Phase Information for Source Separation via Spectrogram Factorization

Spectrogram factorization methods have been proposed for single-channel source separation and audio analysis. Typically, the mixture signal is first converted into a time-frequency representation such as the short-time Fourier transform (STFT). The phase information is discarded, and the resulting spectrogram matrix is factored into a sum of rank-one source spectrograms. This approach incorrectly assumes that the mixture spectrogram is the sum of the source spectrograms. In fact, the mixture spectrogram depends on the phase of the source STFTs. We investigate the consequences of this common assumption and introduce an approach that leverages a probabilistic representation of phase to improve the separation results.
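The dependence on phase can be illustrated with a single time-frequency bin. The following is a minimal sketch, not the paper's method: the function name `mixture_magnitude` and the amplitudes 1.0 and 0.5 are assumptions chosen only for illustration. It compares the magnitude of the sum of two complex STFT coefficients against the sum of their magnitudes, which is what the additivity assumption would predict.

```python
import cmath
import math


def mixture_magnitude(a1, phase1, a2, phase2):
    """Magnitude of the sum of two complex STFT coefficients.

    a1, a2 are source magnitudes; phase1, phase2 are source phases in radians.
    (Hypothetical helper for illustration only.)
    """
    s1 = cmath.rect(a1, phase1)  # source 1 coefficient in one time-frequency bin
    s2 = cmath.rect(a2, phase2)  # source 2 coefficient in the same bin
    return abs(s1 + s2)


# Illustrative source magnitudes (assumed values).
a1, a2 = 1.0, 0.5

# What the "mixture spectrogram = sum of source spectrograms" assumption predicts.
additive_estimate = a1 + a2

# Actual mixture magnitude for three relative phases.
in_phase = mixture_magnitude(a1, 0.0, a2, 0.0)
quadrature = mixture_magnitude(a1, 0.0, a2, math.pi / 2)
out_of_phase = mixture_magnitude(a1, 0.0, a2, math.pi)

print("additive estimate:", additive_estimate)
print("in phase:         ", in_phase)
print("quadrature:       ", quadrature)
print("out of phase:     ", out_of_phase)
```

Only the in-phase case matches the additive estimate. For any other relative phase the mixture magnitude is smaller, so a factorization that treats magnitudes as additive will generally misattribute energy between sources.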

Irfan A. Essa | R. Mitchell Parry

[1] Mark D. Plumbley, et al. Investigating Single-Channel Audio Source Separation Methods Based on Non-Negative Matrix Factorization, 2006.

[2] Brendan J. Frey, et al. Probabilistic Inference of Speech Signals from Phaseless Spectrograms, 2003, NIPS.

[3] Mark D. Plumbley, et al. Polyphonic music transcription by non-negative sparse coding of power spectra, 2004.

[4] P. Smaragdis, et al. Non-negative matrix factorization for polyphonic music transcription, 2003, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.

[5] Mark D. Plumbley, et al. Polyphonic transcription by non-negative sparse coding of power spectra, 2004, ISMIR.

[6] Barak A. Pearlmutter, et al. Convolutive Non-Negative Matrix Factorisation with a Sparseness Constraint, 2006.

[7] Tuomas Virtanen, et al. Separation of sound sources by convolutive sparse coding, 2004, SAPA@INTERSPEECH.

[8] Aapo Hyvärinen, et al. Independent Component Analysis: Fast ICA by a fixed-point algorithm that maximizes non-Gaussianity, 2001.

[9] Michael A. Casey, et al. Separation of Mixed Audio Sources By Independent Subspace Analysis, 2000, ICMC.

[10] Derry Fitzgerald, et al. Sub-Band Independent Subspace Analysis for Drum Transcription, 2002.