A memory-efficient graph kernel

In this paper, we show how the learning models generated by a recently introduced state-of-the-art kernel for graphs can be optimized with respect to memory occupancy. After a brief description of the kernel, we introduce a novel representation of its explicit feature space, based on a hash function, that reduces the amount of memory needed both during the training phase and to represent the final learned model. We then study the application of a feature selection strategy based on the F-score to further reduce the number of features in the final model. On two representative datasets involving binary classification of chemical graphs, we show that the memory occupancy of the final model can be reduced significantly (by up to one order of magnitude) with a moderate loss in classification performance.
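
To make the two ideas above concrete, the following minimal Python sketch stores an explicit feature vector as a sparse map from hashed integer keys to weights, and then ranks features by F-score to keep only the best ones. It is not the implementation used in this paper. The string encoding of features, the BLAKE2b hash, the 32-bit key width, and the helper names `feature_key`, `hashed_vector`, `f_scores`, and `select_top_k` are all assumptions made for illustration. The F-score is assumed to be the commonly used Chen and Lin definition (between-class separation of the feature means divided by the sum of the within-class sample variances), which the text does not spell out.

```python
import hashlib
from collections import defaultdict


def feature_key(feature: str, bits: int = 32) -> int:
    """Map a string-encoded feature to a fixed-width integer key.

    Assumption: features are available as strings; BLAKE2b is used only
    because it is stable across runs (unlike Python's built-in hash()).
    """
    digest = hashlib.blake2b(feature.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


def hashed_vector(features):
    """Build a sparse vector {integer key: weight} from (feature, weight) pairs."""
    vec = defaultdict(float)
    for name, weight in features:
        vec[feature_key(name)] += weight  # colliding features share one slot
    return dict(vec)


def f_scores(vectors, labels):
    """F-score per feature key (assumed Chen and Lin definition).

    Labels equal to 1 are positive, all others negative. Each class must
    contain at least two examples so the sample variance is defined.
    """
    pos = [v for v, y in zip(vectors, labels) if y == 1]
    neg = [v for v, y in zip(vectors, labels) if y != 1]
    scores = {}
    for key in set().union(*vectors):
        xp = [v.get(key, 0.0) for v in pos]  # absent feature counts as 0
        xn = [v.get(key, 0.0) for v in neg]
        mean_p = sum(xp) / len(xp)
        mean_n = sum(xn) / len(xn)
        mean_all = (sum(xp) + sum(xn)) / (len(xp) + len(xn))
        var_p = sum((x - mean_p) ** 2 for x in xp) / (len(xp) - 1)
        var_n = sum((x - mean_n) ** 2 for x in xn) / (len(xn) - 1)
        num = (mean_p - mean_all) ** 2 + (mean_n - mean_all) ** 2
        den = var_p + var_n
        if den > 0:
            scores[key] = num / den
        else:
            # Zero variance in both classes: perfectly separating or constant.
            scores[key] = float("inf") if num > 0 else 0.0
    return scores


def select_top_k(scores, k):
    """Return the set of the k feature keys with the highest F-score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


# Toy data: four "graphs", each a list of (feature, weight) pairs.
graphs = [
    [("A-B", 1.0), ("C", 1.0)],
    [("A-B", 2.0), ("C", 1.0)],
    [("C", 1.0)],
    [("C", 1.0), ("D", 1.0)],
]
labels = [1, 1, -1, -1]

vectors = [hashed_vector(g) for g in graphs]
scores = f_scores(vectors, labels)
kept = select_top_k(scores, k=1)
# Pruned vectors keep only the selected keys, shrinking the stored model.
pruned = [{k: w for k, w in v.items() if k in kept} for v in vectors]
```

In this toy setting, the feature that appears only in the positive class receives the highest score, while a feature with the same value in every graph scores zero and is discarded first. Storing integer keys rather than the original feature encodings is what saves memory, at the cost of occasional collisions between distinct features.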