Mel-frequency cepstral coefficient analysis in speech recognition

Speech recognition is a major topic in speech signal processing and is considered one of the most popular and reliable biometric technologies used in automatic personal identification systems. Speech recognition systems are used in a variety of applications, such as multimedia browsing tools, access centres, security, and finance, and they allow people working in active environments to use a computer. Reliable, high-accuracy speech recognition requires simple and efficient representation methods. In this paper, zero-crossing extraction and energy-level detection are applied to the recorded speech signal to detect voiced and unvoiced regions. The detected voiced signals are then segmented into windows, and the mel-frequency cepstral coefficient (MFCC) method is applied to every segmented window. The extracted MFCC data are used as inputs for neural network training.
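
The voiced/unvoiced detection step can be illustrated with a minimal sketch that combines short-time energy and zero-crossing rate. This is not the paper's implementation: the function names (frame_signal, voiced_mask), the frame length and hop (400 and 160 samples, i.e. 25 ms and 10 ms at an assumed 16 kHz sampling rate), and both thresholds are assumptions chosen only for illustration. The sketch assumes the common heuristic that voiced speech has high energy and a low zero-crossing rate.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (trailing samples are dropped)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def voiced_mask(x, frame_len=400, hop=160, energy_thresh=0.01, zcr_thresh=0.15):
    """Return one boolean per frame: True where the frame looks voiced.

    All default values are illustrative assumptions, not values from the paper.
    """
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)

    # Short-time energy: mean squared amplitude of each frame.
    energy = np.mean(frames ** 2, axis=1)

    # Zero-crossing rate: fraction of adjacent sample pairs whose sign differs.
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

    # Assumed heuristic: voiced frames have high energy and low ZCR.
    return (energy > energy_thresh) & (zcr < zcr_thresh)
```

Frames flagged as voiced would then be passed on to segmentation and MFCC extraction. In practice the thresholds would need tuning to the recording conditions.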