An improved vector quantization method using deep neural network

Abstract: To address the challenging problem of vector quantization (VQ) of high-dimensional vectors with a large number of coding bits, this work proposes a novel VQ method based on a deep neural network (DNN). The method uses a k-means based vector quantizer as the encoder and a DNN as the decoder. The decoder is initialized with the decoder network of a deep auto-encoder, fed with the codes produced by the k-means based vector quantizer, and trained to minimize the coding error of the VQ system. Experiments on speech spectrogram coding demonstrate that the proposed method significantly reduces the coding error compared with both the k-means based method and a recently introduced DNN-based method. Furthermore, in experiments on coding multi-frame speech spectrograms, the proposed method achieves a relative gain of about 11% over the k-means based method in terms of segmental signal-to-noise ratio (SegSNR).
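To make the baseline and the evaluation metric concrete, the following is a minimal sketch of a k-means vector quantizer (nearest-codeword encoder, codebook-lookup decoder) together with a frame-wise SegSNR computation. It does not implement the proposed DNN decoder. The function names, the plain Lloyd iteration, the random initialization, and the SegSNR variant used here (per-row SNR in dB averaged over frames, with no clamping) are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np


def encode(x, codebook):
    """Map each row of x to the index of its nearest codeword (squared Euclidean)."""
    dists = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def decode(codes, codebook):
    """Baseline decoder: look up the codeword for each index."""
    return codebook[codes]


def kmeans_codebook(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed variant); returns a (k, dim) codebook."""
    rng = np.random.default_rng(seed)
    # Initialize codewords from k distinct training vectors.
    codebook = x[rng.choice(len(x), size=k, replace=False)].astype(float)
    for _ in range(iters):
        codes = encode(x, codebook)
        for j in range(k):
            members = x[codes == j]
            if len(members) > 0:  # leave empty clusters unchanged
                codebook[j] = members.mean(axis=0)
    return codebook


def segsnr_db(x, x_hat, eps=1e-12):
    """Per-frame SNR in dB, averaged over frames (rows). Assumed definition."""
    signal = (x ** 2).sum(axis=1)
    noise = ((x - x_hat) ** 2).sum(axis=1)
    return float(np.mean(10.0 * np.log10((signal + eps) / (noise + eps))))


if __name__ == "__main__":
    # Synthetic stand-in data, not speech spectrograms.
    rng = np.random.default_rng(0)
    frames = rng.normal(size=(500, 16))
    codebook = kmeans_codebook(frames, k=64)  # 64 codewords = 6 bits per vector
    reconstruction = decode(encode(frames, codebook), codebook)
    print("SegSNR (dB):", segsnr_db(frames, reconstruction))
```

In this sketch the decoder is only a table lookup. The method described in the abstract replaces that lookup with a trained DNN while keeping the k-means encoder, so the bit rate stays the same and only the reconstruction changes.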
