Stochastic Information Granules Extraction for Graph Embedding and Classification

Graphs are data structures able to efficiently describe real-world systems and, as such, have been extensively used in recent years by many branches of science, including machine learning engineering. However, the design of efficient graph-based pattern recognition systems is bottlenecked by the intrinsic problem of how to properly match two graphs. In this paper, we investigate a granular computing approach for the design of a general purpose graph-based classification system. The overall framework relies on the extraction of meaningful pivotal substructures on the top of which an embedding space can be build and in which the classification can be performed without limitations. Due to its importance, we address whether information can be preserved by performing stochastic extraction on the training data instead of performing an exhaustive extraction procedure which is likely to be unfeasible for large datasets. Tests on benchmark datasets show that stochastic extraction can lead to a meaningful set of pivotal substructures with a much lower memory footprint and overall computational burden, making the proposed strategies suitable also for dealing with big datasets.