Domain-Driven Data Mining: Methodologies and Applications

The aim of data mining is to discover actionable knowledge that meets real user needs, which is one of the grand challenges in KDD. Most existing data mining is a data-driven trial-and-error process. Patterns discovered through predefined models in such a process are often of limited interest to real business, which operates under constraints. To find patterns that are genuinely interesting and actionable in the real world, pattern discovery is more likely to be a domain-driven process of human-machine cooperation. This talk proposes a practical data mining methodology named “domain-driven data mining”. Its main ideas include a Domain-Driven In-Depth Pattern Discovery framework (DDID-PD), constraint-based mining, in-depth mining, human-cooperated mining, and loop-closed mining. Guided by this methodology, we demonstrate some of our work on identifying useful correlations in real stock markets, for instance, discovering optimal trading rules from existing rule classes and mining correlations between trading rules and stocks in stock exchange data. The results have attracted strong interest from both traders and researchers in stock markets. They show that the methodology has potential for guiding deep mining of patterns that are interesting to real business.