Mining Databases with Different Schemas: Integrating Incompatible Classifiers

Distributed data mining systems aim to discover and combine useful information that is distributed across multiple databases. The JAM system, for example, applies machine learning algorithms to compute models over distributed data sets and employs meta-learning techniques to combine the multiple models. Occasionally, however, these models (or classifiers) are induced from databases that have moderately different schemas and hence are incompatible. In this paper, we investigate the problem of combining multiple models computed over distributed data sets with different schemas and propose approaches to address it. Through experiments performed on actual credit card data provided by two different financial institutions, we evaluate the effectiveness of the proposed approaches and demonstrate their potential utility.