Pseudo Complex Cepstrum Using Discrete Cosine Transform

Two new algorithms are proposed that obtain a pseudo complex cepstrum using the Discrete Cosine Transform (DCT). We call the result the Discrete Cosine Transformed Cepstrum (DCTC). The first algorithm applies the relation between the Discrete Fourier Transform (DFT) and the DCT. Computing the complex cepstrum using the Fourier transform requires the unwrapped phase, which is difficult to calculate whenever multiple zeros and poles occur near or on the unit circle. Since the DCT is real-valued, its phase can only be 0 or π, and the phase is unwrapped by representing a negative sign by exp(−jπ) and a positive sign by exp(j0). The second algorithm obviates the need for the DFT and obtains the DCTC by representing the DCT sequence itself by magnitude and phase components. Phase is unwrapped in the same way as in the first algorithm. We have tested the DCTC on a simulated system that has multiple poles and zeros near or on the unit circle. The results show that the DCTC matches the theoretical complex cepstrum more closely than the DFT-based complex cepstrum does. We have explored possible uses for the DCTC in obtaining the pitch contour of syllables, words and sentences. It is shown that the spectral envelope obtained from the first few coefficients matches the envelope of the signal spectrum under consideration reasonably well, and can thus be used in applications where faithful reproduction of the spectral envelope is not critical. We also examine the utility of the DCTC as a feature set for speaker identification. The identification rate with the DCTC as the feature vector was higher than that with linear prediction-derived cepstral coefficients.
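The sign-based phase rule used by the second algorithm can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract states only that the DCT sequence is split into magnitude and phase, with negative signs mapped to exp(−jπ) and positive signs to exp(j0). The orthonormal DCT-II scaling, the logarithm followed by an inverse DCT (assumed by analogy with the conventional complex cepstrum), the magnitude floor, and all function names below are assumptions made for illustration.

```python
import cmath
import math


def dct2(x):
    """Orthonormal DCT-II, direct O(N^2) form (scaling is an assumption)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[m] * math.cos(math.pi * (2 * m + 1) * k / (2 * n))
                for m in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III); also accepts complex input."""
    n = len(c)
    out = []
    for m in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        for k in range(1, n):
            s += (c[k] * math.sqrt(2.0 / n)
                  * math.cos(math.pi * (2 * m + 1) * k / (2 * n)))
        out.append(s)
    return out


def sign_phase(value):
    """Phase of a real number: 0 for a positive sign, -pi for a negative sign."""
    return -math.pi if value < 0 else 0.0


def pseudo_complex_cepstrum(x, floor=1e-12):
    """Hypothetical sketch: DCT, then log-magnitude plus j*phase, then inverse DCT.

    `floor` guards against log(0) and is an assumption, not part of the source.
    """
    coeffs = dct2(x)
    log_spec = [complex(math.log(max(abs(c), floor)), sign_phase(c))
                for c in coeffs]
    return idct2(log_spec)


if __name__ == "__main__":
    frame = [1.0, -0.5, 0.25, 2.0, -1.5, 0.75, 0.0, 0.3]
    for i, value in enumerate(pseudo_complex_cepstrum(frame)):
        print(i, value)
```

Because the phase takes only the values 0 and −π, exponentiating each log-spectrum entry recovers the signed DCT coefficient exactly, so the sketch involves no phase ambiguity of the kind that complicates DFT-based unwrapping.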
