An unsupervised, sequential learning algorithm for the segmentation of speech waveforms with multiple speakers
The authors present a method for segmenting speech waveforms containing several speakers into utterances, each from one individual, and then identifying each utterance as coming from a specific individual or group of individuals. The procedure is unsupervised in that there is no training set, and sequential in that information obtained in early stages of the process is utilized in later stages.
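The abstract does not specify the algorithm itself, so the following is only an illustrative sketch of what "unsupervised and sequential" labelling can look like, not the authors' procedure. It swaps in a simple leader-style clustering: each utterance is assumed to be already segmented and summarized as a fixed-length feature vector, and the function name `sequential_cluster`, the Euclidean distance, and the `threshold` parameter are all hypothetical choices made for this example.

```python
import math

def sequential_cluster(utterances, threshold):
    """Label utterance feature vectors one at a time, with no training set.

    Illustrative assumption: leader-style clustering with Euclidean
    distance. This is NOT the method of the paper.
    """
    centroids = []  # running mean vector for each hypothesized speaker
    counts = []     # number of utterances assigned to each speaker so far
    labels = []
    for vec in utterances:
        # Compare only against speakers discovered in earlier steps.
        best, best_dist = None, float("inf")
        for k, centroid in enumerate(centroids):
            d = math.dist(vec, centroid)
            if d < best_dist:
                best, best_dist = k, d
        if best is None or best_dist > threshold:
            # Too far from every known speaker: hypothesize a new one.
            centroids.append(list(vec))
            counts.append(1)
            best = len(centroids) - 1
        else:
            # Refine the matched speaker's running mean, so early
            # decisions inform later ones.
            counts[best] += 1
            n = counts[best]
            centroids[best] = [c + (x - c) / n
                               for c, x in zip(centroids[best], vec)]
        labels.append(best)
    return labels

# Toy data: two well-separated groups of 2-D "features".
toy = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (0.05, 0.1), (5.1, 4.9)]
print(sequential_cluster(toy, threshold=1.0))
```

In this sketch the speaker models start empty and are built up as utterances arrive, which is the sense in which earlier stages feed later ones; a real system would use acoustic features and a statistically motivated distance rather than a fixed Euclidean threshold.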