Discovering Topical Structures of Databases Professor :

In today’s enterprise environments, the scale and growing complexity of databases, together with a prevalent lack of documentation, make it hard for a data architect to understand, reverse engineer, and integrate them. This paper addresses the problem of discovering the topical structure of a database in order to support semantic browsing and large-scale data integration. It proposes the iDisc approach, which combines a novel multi-strategy discovery framework with a novel clustering aggregation technique. Starting from a formal definition of the problem, the paper explains the iDisc approach in detail, including the algorithms for each module.