Computational auditory scene analysis based on residue-driven architecture and its application to mixed speech recognition