Di-Learn: Distributed Knowledge Discovery with Human Interaction

Data analysis continues to pose new challenges. Network-based distributed computing environments have made it possible to handle this problem to some extent. Today, data modeling and knowledge discovery rely mostly on centralized algorithms drawn from statistics and machine learning, and many of these algorithms require data to be collected from distributed sites. Although huge amounts of data can be stored, the data are useful only if analysts can extract beneficial information from them in the form of patterns or rules. Techniques from the area of Knowledge Discovery in Databases (KDD) address this problem. The KDD process is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data [1]. The process includes several steps, which are invoked and parameterized in an interactive and iterative manner. Existing knowledge discovery and machine learning algorithms are based on a rule set or rule base; in some cases this rule information is expanded over time as the algorithm processes more data. However, this expansion is limited by the variety of the knowledge and by the initial state of the implementation. In this project, we have implemented a distributed knowledge discovery software system called Di-Learn, which offers a new approach to this problem: human interaction.