Data mining with graphical models

Data mining, or knowledge discovery in databases, is a fairly young research area that has emerged in response to the flood of data we face today. It aims to develop methods that help people discover useful patterns in their data. One of these techniques, and definitely one of the most important because it can be used for such frequent data mining tasks as classifier construction and dependence analysis, is learning graphical models from datasets of sample cases. In this paper we review the ideas underlying graphical models, with special emphasis on the less well-known possibilistic networks. We discuss the main principles of learning graphical models from data and briefly consider some algorithms that have been proposed for this task, as well as data preprocessing methods and evaluation measures.
