Efficient Multi-relational Classification by Tuple ID Propagation

Most of today’s structured data is stored in relational databases. In contrast, most classification approaches apply only to single “flat” data relations, and it is usually difficult to convert multiple relations into a single flat relation without losing essential information. Inductive Logic Programming approaches have proven effective in multi-relational classification, achieving high accuracy. Unfortunately, they usually scale poorly with the number of relations and the number of attributes in the database. In this paper we propose CrossMine, an efficient and scalable approach for multi-relational classification. It uses a novel method, tuple ID propagation, to perform virtual joins, achieving both high classification accuracy and high efficiency on databases with complex schemas. We present experimental results on two real datasets to show the performance of our approach.
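
To make the idea of a virtual join concrete, the following is a minimal sketch of tuple ID propagation, not the CrossMine implementation itself. It assumes that propagation means attaching, to each tuple of a non-target relation, the set of IDs of the target tuples that join with it, so that class counts for a candidate predicate can be read off without materializing the join. The relation names (loans, accounts), the column names, and the helper functions propagate_ids and class_counts are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: "loans" is the target relation (it carries class labels),
# "accounts" is a non-target relation joined on account_id.
loans = [
    {"id": 1, "account_id": 124, "label": True},
    {"id": 2, "account_id": 124, "label": True},
    {"id": 3, "account_id": 108, "label": False},
    {"id": 4, "account_id": 45, "label": False},
    {"id": 5, "account_id": 45, "label": True},
]
accounts = [
    {"account_id": 124, "frequency": "monthly"},
    {"account_id": 108, "frequency": "weekly"},
    {"account_id": 45, "frequency": "monthly"},
    {"account_id": 67, "frequency": "weekly"},
]


def propagate_ids(target_rows, other_rows, target_key, other_key):
    """Attach to each row of other_rows the IDs of the target rows that join with it."""
    ids_by_key = defaultdict(set)
    for row in target_rows:
        ids_by_key[row[target_key]].add(row["id"])
    # Copy each set so that rows sharing a join key do not share one mutable set.
    return [
        {**row, "ids": set(ids_by_key.get(row[other_key], ()))}
        for row in other_rows
    ]


def class_counts(propagated_rows, labels, predicate):
    """Count positive and negative target tuples covered by a predicate on the other relation."""
    covered = set()
    for row in propagated_rows:
        if predicate(row):
            covered |= row["ids"]
    positives = sum(1 for tuple_id in covered if labels[tuple_id])
    return positives, len(covered) - positives


labels = {row["id"]: row["label"] for row in loans}
propagated = propagate_ids(loans, accounts, "account_id", "account_id")
monthly = class_counts(propagated, labels, lambda r: r["frequency"] == "monthly")
weekly = class_counts(propagated, labels, lambda r: r["frequency"] == "weekly")
```

In this sketch only the ID sets travel to the accounts relation. The joined table is never built, which is the sense in which the join is virtual.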