Image Classification Via LZ78 Based String Kernel: A Comparative Study

Normalized Information Distance (NID) [1] is a general-purpose similarity metric based on Kolmogorov complexity. We have previously developed this notion into a valid kernel distance, called the LZ78-based string kernel [2], and shown that it can be used effectively for a variety of 1D sequence classification tasks [3]. In this paper, we demonstrate that it also applies to 2D images. We report experiments on two real datasets: (i) a collection of real-life photographs and (ii) a collection of medical diagnostic images from Magnetic Resonance (MR) data. The classification results are compared with those of the original similarity metric (i.e., NID) and of several conventional classification algorithms. In all cases, the proposed kernel approach performs as well as or better than the other candidate methods, at lower computational overhead.
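
Because Kolmogorov complexity is not computable, NID is usually approximated in practice by substituting the output size of a real compressor. The sketch below illustrates that general idea with the normalized compression distance (NCD), not the LZ78-based kernel proposed here. It makes several assumptions that do not come from this paper: `zlib` (DEFLATE, an LZ77-family coder) stands in for the compressor, inputs are plain byte strings, and the helper names `compressed_size` and `ncd` are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (assumed stand-in for K(x))."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Approximates NID by replacing Kolmogorov complexity with compressed size:
    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)).
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Values near 0 suggest that one input is almost fully described by the other, while values near 1 suggest little shared structure. To apply such a measure to an image, the pixels would first have to be serialized into a byte string; the scan order used for that is a further assumption left open here.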

[1] Ming Li, et al. A robust approach to sequence classification, 2005, 17th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'05).

[2] Bin Ma, et al. The similarity metric, 2001, IEEE Transactions on Information Theory.

[3] David J. Harper, et al. Using compression based language models for text categorization, 2003.

[4] Eamonn J. Keogh, et al. Towards parameter-free data mining, 2004, KDD.

[5] Vittorio Loreto, et al. Language trees and zipping, 2002, Physical Review Letters.

[6] Paul M. B. Vitányi, et al. Clustering by compression, 2003, IEEE Transactions on Information Theory.

[7] Ming Li, et al. An Introduction to Kolmogorov Complexity and Its Applications, 2019, Texts in Computer Science.

[8] J. Andrew Bangham, et al. Morphological scale-space preserving transforms in many dimensions, 1996, J. Electronic Imaging.

[9] Ming Li, et al. An LZ78 Based String Kernel, 2005, ADMA.

[10] Ian H. Witten, et al. Data Compression Using Adaptive Coding and Partial String Matching, 1984, IEEE Trans. Commun.

[11] Yuxuan Lan, et al. Image classification using compression distance, 2005, VVG.

[12] J. Platt. Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines, 1998.

[13] Abraham Lempel, et al. Compression of individual sequences via variable-rate coding, 1978, IEEE Trans. Inf. Theory.