Speech/Audio Signal Classification Using Spectral Flux Pattern Recognition

In this paper, we present a novel method for improving speech and audio signal classification in the MPEG Unified Speech and Audio Coding (USAC) standard using spectral flux (SF) pattern recognition. For effective pattern recognition, a Gaussian mixture model (GMM) is used as the probability model, and its optimal parameters are extracted with the expectation maximization (EM) algorithm. The proposed classification algorithm consists of two main parts: the first extracts the optimal parameters for the GMM, and the second distinguishes between speech and audio signals using SF pattern recognition. The proposed classification algorithm performs better than the conventionally implemented USAC scheme.
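
The two-part structure described above can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper. Several details are assumptions made only for illustration: SF is taken as the squared difference between the magnitude spectra of consecutive non-overlapping frames, the GMM is one-dimensional with two components, one GMM is trained per class, and the decision compares average log-likelihoods. All function names (`spectral_flux`, `fit_gmm_em`, `classify`) are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for the non-negative frequency bins of one frame."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_flux(signal, frame_len=64):
    """Assumed SF definition: squared spectral change between consecutive frames."""
    starts = range(0, len(signal) - frame_len + 1, frame_len)
    spectra = [magnitude_spectrum(signal[s:s + frame_len]) for s in starts]
    return [
        sum((a - b) ** 2 for a, b in zip(cur, prev))
        for prev, cur in zip(spectra, spectra[1:])
    ]


def gaussian_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_em(data, k=2, iters=50, var_floor=1e-6):
    """Part 1: estimate 1-D GMM parameters with EM.

    Returns a list of (weight, mean, variance) tuples.
    """
    n = len(data)
    ordered = sorted(data)
    # Deterministic initialisation: means at evenly spaced quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(data) / n
    spread = max(sum((x - centre) ** 2 for x in data) / n, var_floor)
    variances = [spread] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, var_floor)
    return list(zip(weights, means, variances))


def avg_log_likelihood(data, gmm):
    """Mean log-likelihood of the samples under a fitted GMM."""
    return sum(
        math.log(sum(w * gaussian_pdf(x, m, v) for w, m, v in gmm) + 1e-300)
        for x in data
    ) / len(data)


def classify(sf_values, speech_gmm, audio_gmm):
    """Part 2: label a segment by the class GMM that explains its SF values better."""
    speech_score = avg_log_likelihood(sf_values, speech_gmm)
    audio_score = avg_log_likelihood(sf_values, audio_gmm)
    return "speech" if speech_score > audio_score else "audio"
```

In this sketch, each class model would be obtained by calling `fit_gmm_em` on SF values computed from labeled training material, and `classify` would then be applied to the SF values of an unlabeled segment. The naive DFT is used only to keep the example dependency-free; a practical system would use an FFT and the framing of the codec itself.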