Combining inductive and deductive tools for data analysis

In this paper we propose combining inductive and deductive techniques to improve the data analysis process. Inductive techniques generate hypotheses from data, whereas deductive techniques derive knowledge and verify hypotheses. On the inductive side we use several techniques together: clustering algorithms to derive partitions of the data, and decision tree induction to characterize the resulting classes in terms of logical rules. To guide users through the analysis process, we have developed a system that integrates deductive tools with data mining tools such as classification algorithms, feature selection algorithms, visualization tools, and tools for manipulating data sets easily. The system is currently used in a large project whose aim is to integrate information sources containing data on the socio-economic aspects of Calabria and then analyze the integrated data. Several experiments on the socio-economic data have shown that the combined use of different techniques improves both the comprehensibility and the accuracy of the models.
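
The pipeline described above (cluster the data, describe the clusters with logical rules, then check the rules deductively) can be illustrated with a small sketch. This is not the system presented in the paper: the records, the two feature names, the tiny k-means routine, and the single-threshold rule learner (a one-level stand-in for decision tree induction) are all assumptions chosen to keep the example self-contained.

```python
from math import dist
from statistics import mean

# Hypothetical records: (income, unemployment_rate) per area. Illustrative values only.
FEATURES = ("income", "unemployment_rate")
rows = [(12.0, 21.0), (13.5, 19.0), (11.0, 23.0),
        (24.0, 8.0), (26.0, 7.5), (25.0, 9.0)]

def kmeans(points, k=2, iterations=20):
    """Inductive step 1: partition the points into k clusters (requires k >= 2)."""
    # Deterministic initialisation: pick evenly spaced points as starting centres.
    centres = [points[i * (len(points) - 1) // (k - 1)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centre, then recompute the centres.
        labels = [min(range(k), key=lambda c: dist(p, centres[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centres[c] = tuple(mean(axis) for axis in zip(*members))
    return labels

def induce_rule(points, labels):
    """Inductive step 2: find the rule 'feature <= t' that best explains the labels."""
    best = None
    for f in range(len(points[0])):
        values = sorted({p[f] for p in points})
        for low, high in zip(values, values[1:]):
            t = (low + high) / 2
            left = [l for p, l in zip(points, labels) if p[f] <= t]
            right = [l for p, l in zip(points, labels) if p[f] > t]
            left_cls = max(set(left), key=left.count)
            right_cls = max(set(right), key=right.count)
            correct = left.count(left_cls) + right.count(right_cls)
            if best is None or correct > best[0]:
                best = (correct, f, t, left_cls, right_cls)
    _, f, t, left_cls, right_cls = best
    return {"feature": f, "threshold": t, "if_true": left_cls, "if_false": right_cls}

def apply_rule(rule, point):
    """Deductive step: derive the class of a record from the rule."""
    below = point[rule["feature"]] <= rule["threshold"]
    return rule["if_true"] if below else rule["if_false"]

def counterexamples(rule, points, labels):
    """Deductive step: verify the hypothesis by listing records that contradict it."""
    return [p for p, l in zip(points, labels) if apply_rule(rule, p) != l]

labels = kmeans(rows)
rule = induce_rule(rows, labels)
violations = counterexamples(rule, rows, labels)
print(f"IF {FEATURES[rule['feature']]} <= {rule['threshold']} "
      f"THEN cluster {rule['if_true']} ELSE cluster {rule['if_false']}")
print("counterexamples:", violations)
```

The clustering and the rule learner only produce hypotheses; the final check is what decides whether a rule is kept, which mirrors the division of labor between inductive and deductive tools described in the abstract.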
