Exploring the capabilities of merged normalized forward-backward correlation for speech pitch analysis

This article examines the use of time-domain merged normalized forward-backward correlation (MNFBC) for pitch estimation in speech signals. The method is intended to avoid the shortcomings of other methods commonly used in pitch detection algorithms (PDAs). The text also compares possible improvements to the voicing decision stage of MNFBC and considers final smoothing of the fundamental frequency (F0) with the Viterbi algorithm. F0 precision and the voiced-unvoiced (VUV) decision were evaluated against a pitch reference database (part of Spanish Speecon). The results show that the F0 estimation precision of MNFBC combined with Viterbi smoothing, using conversion to cents in the transition probability function, is comparable to that of PRAAT cross-correlation. Although additional signal energy thresholding lowers the unvoiced errors for close-talk channel 0, the PRAAT algorithm still gives better results on that channel; the difference evens out for channel 1 (lavaliere microphone). The noise robustness of the algorithm could be improved by preceding it with a noise reduction block.
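
To illustrate the idea of expressing F0 jumps in cents inside a Viterbi transition term, the following is a minimal sketch and not the implementation evaluated in the article. The function names, the additive score formulation, and the `penalty_per_cent` weight are assumptions made for illustration; the only fixed fact is the standard definition of the cent, 1200 * log2(f2 / f1).

```python
import math


def cents(f_from, f_to):
    """Interval from f_from to f_to in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(f_to / f_from)


def viterbi_smooth(candidates, scores, penalty_per_cent=0.001):
    """Pick one F0 candidate per frame by dynamic programming.

    candidates[t]    -- list of F0 candidates in Hz for frame t
    scores[t]        -- local score of each candidate (higher is better)
    penalty_per_cent -- hypothetical weight on the jump size in cents
    """
    # best[t][j]: best accumulated score of a path ending in candidate j.
    best = [list(scores[0])]
    back = []
    for t in range(1, len(candidates)):
        row, ptr = [], []
        for j, f in enumerate(candidates[t]):
            # Transition term: penalize the absolute jump measured in cents.
            options = [
                best[t - 1][i] - penalty_per_cent * abs(cents(g, f))
                for i, g in enumerate(candidates[t - 1])
            ]
            i_best = max(range(len(options)), key=options.__getitem__)
            row.append(options[i_best] + scores[t][j])
            ptr.append(i_best)
        best.append(row)
        back.append(ptr)

    # Trace the best path back from the last frame.
    j = max(range(len(best[-1])), key=best[-1].__getitem__)
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return [candidates[t][j] for t, j in enumerate(path)]


# The middle frame has a slightly stronger octave-doubled candidate,
# but the jump penalty keeps the track on the consistent value.
track = viterbi_smooth(
    candidates=[[100.0], [100.0, 200.0], [100.0]],
    scores=[[1.0], [0.8, 0.9], [1.0]],
)
```

Measuring the jump in cents rather than in hertz makes the penalty depend on the frequency ratio, so a step from 100 Hz to 110 Hz costs the same as a step from 200 Hz to 220 Hz. An unvoiced state is left out of this sketch.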