Sequential Organization and Room Reverberation for Speech Segregation

Abstract: Significant advances in speech segregation, inspired by the perceptual account of auditory scene analysis, were made in recent years. Despite these advances, two major challenges remained: sequential organization and room reverberation. This project aimed to address these two challenges, and substantial progress was made along the following directions. First, a tandem algorithm was developed that performs pitch tracking and voiced speech segregation iteratively. Second, a multipitch tracking algorithm was proposed for noisy and reverberant speech, which was then used in a novel supervised learning approach to the segregation of voiced speech in reverberant environments. Third, a method was proposed for unvoiced speech segregation that first removes voiced speech and periodic components, and then groups unvoiced speech segments by analyzing their spectral characteristics. Fourth, two algorithms were proposed for sequential organization: an unsupervised clustering algorithm applicable to monaural recordings, and a binaural algorithm that integrates monaural and binaural analyses. In addition, speech intelligibility tests were conducted, and their results firmly established the effectiveness of binary masking for improving human speech recognition in noisy backgrounds.
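To make the notion of binary masking concrete, the following is a minimal sketch of an ideal binary mask over time-frequency units. It is an illustration only, not the algorithms developed in this project. The function names, the toy energy values, and the 0 dB local criterion `lc_db` are all assumptions. It also assumes that the premixed speech and noise energies are known, which holds in controlled intelligibility tests but not in real segregation.

```python
import math

def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Return a 0/1 mask over time-frequency units.

    A unit is kept (1) when its local SNR in dB exceeds lc_db,
    and discarded (0) otherwise. Inputs are lists of rows, one
    row per time frame and one entry per frequency channel.
    """
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            if s <= 0.0:
                keep = False          # no speech energy in this unit
            elif n <= 0.0:
                keep = True           # speech present, no noise
            else:
                keep = 10.0 * math.log10(s / n) > lc_db
            row.append(1 if keep else 0)
        mask.append(row)
    return mask

def apply_mask(mixture_energy, mask):
    """Zero out the mixture units that the mask marks as noise-dominant."""
    return [[e * k for e, k in zip(e_row, k_row)]
            for e_row, k_row in zip(mixture_energy, mask)]

# Toy example (hypothetical values): 2 time frames x 2 frequency channels.
speech = [[4.0, 0.1], [0.5, 9.0]]
noise = [[1.0, 1.0], [2.0, 1.0]]
mixture = [[s + n for s, n in zip(s_row, n_row)]
           for s_row, n_row in zip(speech, noise)]

mask = ideal_binary_mask(speech, noise)
masked = apply_mask(mixture, mask)
```

In this toy example, only the units where speech energy exceeds noise energy are retained. In practice the mask has to be estimated from the mixture alone, which is the task that segregation algorithms address.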