TF-IDF based binary fingerprint search with vector quantization error compensation

In this paper, we propose a TF-IDF based binary fingerprint search scheme that compensates for vector quantization error. The scheme is intended to support search over large-scale multimedia data, with tens or hundreds of billions of fingerprints. To compensate for the quantization error, we consider sets of index candidates. Index data and fingerprint data are managed differently to support large-scale content search: index data is kept in in-memory storage, while fingerprint data is kept in a traditional DB with an in-memory cache.
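
The abstract does not spell out the algorithm, so the following Python sketch is only one plausible reading of the idea, not the authors' implementation. It assumes that each content item is a sequence of 8-bit binary sub-fingerprints, that a sub-fingerprint's integer value serves as its index word, and that the "set of index candidates" for a query word is the word itself plus every word at Hamming distance 1. The names `FingerprintIndex`, `candidates`, `WORD_BITS` and `NEIGHBOR_WEIGHT`, the smoothed IDF formula, and the down-weighting of neighbor matches are all hypothetical choices made for illustration. A plain dict stands in for the DB and in-memory cache that hold the fingerprint data.

```python
import math
from collections import Counter, defaultdict

WORD_BITS = 8          # assumed sub-fingerprint width (hypothetical)
NEIGHBOR_WEIGHT = 0.5  # assumed weight for a non-exact candidate (hypothetical)


def candidates(word, compensate=True):
    """Return the index candidates for one query word.

    With compensation, this is the word plus all words that differ
    from it in exactly one bit (Hamming distance 1).
    """
    if not compensate:
        return [word]
    return [word] + [word ^ (1 << bit) for bit in range(WORD_BITS)]


class FingerprintIndex:
    def __init__(self):
        # Index data: word -> {content_id: term frequency}, held in memory.
        self.postings = defaultdict(Counter)
        # Fingerprint data: stand-in for the DB plus in-memory cache.
        self.store = {}

    def add(self, content_id, words):
        self.store[content_id] = list(words)
        for word in words:
            self.postings[word][content_id] += 1

    def search(self, query, compensate=True):
        """Rank content ids by a TF-IDF score summed over index candidates."""
        n_docs = len(self.store)
        scores = Counter()
        for word in query:
            for cand in candidates(word, compensate):
                posting = self.postings.get(cand)
                if not posting:
                    continue
                # Smoothed IDF: rarer words contribute more to the score.
                idf = math.log(1 + n_docs / len(posting))
                weight = 1.0 if cand == word else NEIGHBOR_WEIGHT
                for content_id, tf in posting.items():
                    scores[content_id] += weight * tf * idf
        return scores.most_common()

    def bit_error_rate(self, query, content_id):
        """Verify a hit against the stored fingerprint (aligned comparison)."""
        stored = self.store[content_id]
        errors = sum(bin(q ^ s).count("1") for q, s in zip(query, stored))
        return errors / (WORD_BITS * min(len(query), len(stored)))


if __name__ == "__main__":
    index = FingerprintIndex()
    index.add("a", [0x12, 0x34, 0x56, 0x78])
    index.add("b", [0x9A, 0xBC, 0xDE, 0xF0])
    index.add("c", [0x12, 0xFF, 0x00, 0xAA])
    noisy = [0x13, 0x35, 0x57, 0x7A]  # "a" with one bit flipped per word
    print(index.search(noisy, compensate=False))
    print(index.search(noisy, compensate=True))
```

In this toy setup, a query whose every word has one flipped bit finds nothing by exact lookup, while the candidate-set lookup still ranks the original item first. The cost is a larger number of posting-list probes per query word (nine instead of one here), which is one reason keeping the index data in memory matters at scale.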
