Supporting Relational Knowledge Discovery: Lessons in Architecture and Algorithm Design

This paper discusses a few of the lessons we have learned developing a relational knowledge discovery system. The relationships among data instances in relational data provide extra information for “mining.” This additional information has the potential to greatly improve the quality of learned models. However, the dependencies among instances in the data also introduce new statistical challenges for learning algorithms. Relational data provide an ideal environment in which to examine a central challenge of knowledge discovery: its “chicken and egg” character. Data representation can impair the ability to learn important knowledge, but knowing the “right” data representation often requires just that knowledge. With relational data, representation is often a choice; many alternate views of the data provide abundant fodder for reasoning about transformations. In light of this, we discuss representation and design choices that support a co-evolutionary process of knowledge discovery and data transformation in relational data.