Learning from data: possibilistic graphical models

One of the major problems in handling imperfect information in knowledge-based systems is finding a computationally appealing description of the available data, one that is economical in its use of storage and also supports efficient reasoning techniques. Whether such a description exists depends strongly on whether the dependencies among the data items can be decomposed into local, more basic dependencies. The relational database model [Maier, 1983; Ullman, 1988; Ullman, 1989] has tackled this problem by storing a database in the form of a lossless join decomposition, namely a collection of projections from which the original relation can be reconstructed. Some recent work in this field concerns structure identification in relational data [Dechter and Pearl, 1992] and the clarification of its connections to the solution of constraint-satisfaction problems [Gyssens et al., 1994]. Although many problems in artificial intelligence, database theory, graph theory, and operations research are connected with decomposition problems, very few general results have been established so far.
In the field of graphical modeling and its widespread application areas in diagnostics, expert systems, KDD systems, planning systems, data analysis, and control, the most advanced results concern probabilistic graphical models [Whittaker, 1990; Castillo et al., 1997; Lauritzen, 1997]. In addition to efficient evidence propagation techniques [Pearl, 1986; Lauritzen and Spiegelhalter, 1988], there are many results on the induction of Bayesian networks or Markov networks from statistical data. For an overview, see [Buntine, 1994; Madigan and Raftery, 1994; Fisher and Lenz, 1996]. The corresponding algorithms are based on linearity and normality assumptions [Pearl and Wermuth, 1993], the extensive testing of conditional independence relations [Spirtes and Glymour, 1991; Verma and Pearl, 1992], or Bayesian approaches [Cooper and Herskovits, 1992; Lauritzen et al., 1993].
Some crucial problems of these methods concern their computational complexity, their limited reliability unless the amount of data is enormous, and strong presuppositions such as the requirement of an ordering of the nodes and a priori distribution assumptions. To overcome the complexity problems, heuristic approaches have turned out to be unavoidable, for example the K2 algorithm [Cooper and Herskovits, 1992] used with an appropriate quality measure [Geiger and Heckerman, 1995], or the fusion of conditional independence tests with Bayesian learning [Singh and Valtorta, 1993].
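
The notion of a lossless join decomposition mentioned above can be illustrated with a minimal sketch. It is not taken from the paper: the helper names (`project`, `natural_join`, `is_lossless`) and the toy relations are assumptions chosen only for illustration. A relation is represented as a set of tuples over a schema, it is projected onto attribute subsets, and the natural join of the projections is compared with the original relation.

```python
def project(relation, schema, attrs):
    """Project a relation (set of tuples over `schema`) onto `attrs`."""
    idx = [schema.index(a) for a in attrs]
    return {tuple(row[i] for i in idx) for row in relation}


def natural_join(r1, schema1, r2, schema2):
    """Natural join of two relations; returns (tuples, joined schema)."""
    shared = [a for a in schema1 if a in schema2]
    extra = [a for a in schema2 if a not in schema1]
    joined = set()
    for t1 in r1:
        for t2 in r2:
            # Tuples combine only if they agree on all shared attributes.
            if all(t1[schema1.index(a)] == t2[schema2.index(a)] for a in shared):
                joined.add(t1 + tuple(t2[schema2.index(a)] for a in extra))
    return joined, list(schema1) + extra


def is_lossless(relation, schema, parts):
    """True if joining the projections onto `parts` rebuilds `relation`."""
    if set(a for p in parts for a in p) != set(schema):
        raise ValueError("the parts must cover every attribute of the schema")
    joined = project(relation, schema, parts[0])
    joined_schema = list(parts[0])
    for attrs in parts[1:]:
        joined, joined_schema = natural_join(
            joined, joined_schema, project(relation, schema, attrs), list(attrs)
        )
    # Reorder columns to the original schema before comparing.
    idx = [joined_schema.index(a) for a in schema]
    return {tuple(row[i] for i in idx) for row in joined} == set(relation)


schema = ("A", "B", "C")
parts = [("A", "B"), ("B", "C")]

# Hypothetical relation in which A and C vary independently for each B value.
decomposable = {
    ("a1", "b1", "c1"), ("a1", "b1", "c2"),
    ("a2", "b1", "c1"), ("a2", "b1", "c2"),
    ("a3", "b2", "c3"),
}
# Hypothetical relation in which A and C are coupled given B.
coupled = {("a1", "b1", "c1"), ("a2", "b1", "c2")}

print(is_lossless(decomposable, schema, parts))
print(is_lossless(coupled, schema, parts))
```

In the first relation the two projections hold all the information, so storing them is enough. In the second, the join produces additional tuples that were never observed, so this particular decomposition loses information about the dependency between A and C.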

[1]  Luis Fariñas del Cerro,et al.  Possibility Theory and Independence , 1994, IPMU.

[2]  Jürg Kohlas,et al.  A Mathematical Theory of Hints , 1995 .

[3]  G. Matheron Random Sets and Integral Geometry , 1976 .

[4]  J. Kohlas,et al.  A Mathematical Theory of Hints: An Approach to the Dempster-Shafer Theory of Evidence , 1995 .

[5]  Prakash P. Shenoy,et al.  Local Computation in Hypertrees , 1991 .

[6]  Rudolf Kruse,et al.  The context model: An integrating view of vagueness and uncertainty , 1993, Int. J. Approx. Reason..

[7]  Ronald Fagin,et al.  Multivalued dependencies and a new normal form for relational databases , 1977, TODS.

[8]  Claude E. Shannon,et al.  The Mathematical Theory of Communication , 1950 .

[9]  H.-G. Leimer Triangulated Graphs with Marked Vertices , 1988 .

[10]  David Heckerman,et al.  An axiomatic framework for belief updates , 1986, UAI.

[11]  D. Dubois,et al.  Belief Revision: Belief change and possibility theory , 1992 .

[12]  T. Fine,et al.  Towards a Frequentist Theory of Upper and Lower Probability , 1982 .

[13]  John E. Shore,et al.  Relative Entropy, Probabilistic Inference, and AI , 1985, UAI.

[14]  M. Golumbic Algorithmic graph theory and perfect graphs , 1980 .

[15]  Judea Pearl Rejoinder to comments on "reasoning with belief functions: An analysis of compatibility" , 1992, Int. J. Approx. Reason..

[16]  L. Jonathan Cohen A Note on Inductive Logic , 1973 .

[17]  Prakash P. Shenoy,et al.  Conditional Independence in Uncertainty Theories , 1992, UAI 1992.

[18]  Judea Pearl,et al.  An Algorithm for Deciding if a Set of Observed Independencies Has a Causal Explanation , 1992, UAI.

[19]  Zoltan Domotor Probability Kinematics and Representation of Belief Change , 1980, Philosophy of Science.

[20]  Rudolf Kruse,et al.  Axiomatic Treatment of Possibilistic Independence , 1995, ECSQARU.

[21]  P. Fishburn The Axioms of Subjective Probability , 1986 .

[22]  Paul P. Wang Advances in Fuzzy Sets, Possibility Theory, and Applications , 1983 .

[23]  Lluis Godo,et al.  MILORD: The architecture and the management of linguistically expressed uncertainty , 1989, Int. J. Intell. Syst..

[24]  Marc Gyssens,et al.  Decomposing Constraint Satisfaction Problems Using Database Techniques , 1994, Artif. Intell..

[25]  David Heckerman,et al.  Probabilistic similarity networks , 1991, Networks.

[26]  Rina Dechter,et al.  Structure Identification in Relational Data , 1992, Artif. Intell..

[27]  David Maier,et al.  The Theory of Relational Databases , 1983 .

[28]  Henry E. Kyburg,et al.  Bayesian and Non-Bayesian Evidential Updating , 1987, Artificial Intelligence.

[29]  Piero P. Bonissone,et al.  RUM: A Layered Architecture for Reasoning with Uncertainty , 1987, IJCAI.

[30]  L. Zadeh Fuzzy sets as a basis for a theory of possibility , 1999 .

[31]  Steffen L. Lauritzen,et al.  Independence properties of directed markov fields , 1990, Networks.

[32]  Philippe Smets,et al.  Resolving misunderstandings about belief functions , 1992, Int. J. Approx. Reason..

[33]  B. M. Hill,et al.  Theory of Probability , 1990 .

[34]  Frank Klawonn,et al.  Fuzzy control on the basis of equality relations with an example from idle speed control , 1995, IEEE Trans. Fuzzy Syst..

[35]  Rina Dechter Decomposing a Relation into a Tree of Binary Relations , 1990, J. Comput. Syst. Sci..

[36]  D. Rose Triangulated graphs and the elimination process , 1970 .

[37]  A. Dawid Conditional Independence in Statistical Theory , 1979 .

[38]  D. Madigan,et al.  Model Selection and Accounting for Model Uncertainty in Graphical Models Using Occam's Window , 1994 .

[39]  Lotfi A. Zadeh,et al.  The concept of a linguistic variable and its application to approximate reasoning-III , 1975, Inf. Sci..

[40]  Judea Pearl,et al.  Fusion, Propagation, and Structuring in Belief Networks , 1986, Artif. Intell..

[41]  Gregory F. Cooper,et al.  A Bayesian method for the induction of probabilistic networks from data , 1992, Machine Learning.

[42]  T. Fine Theories of Probability: An Examination of Foundations , 1973 .

[43]  Philippe Smets Probability of Deductibility and Belief Functions , 1993, ECSQARU.

[44]  Prakash P. Shenoy,et al.  A valuation-based language for expert systems , 1989, Int. J. Approx. Reason..

[45]  Rudolf Kruse,et al.  Qualitative and Quantitative Practical Reasoning , 1997, Lecture Notes in Computer Science.

[46]  William Harper,et al.  Causation in decision, belief change, and statistics , 1988 .

[47]  James J. Buckley,et al.  A fuzzy expert system , 1986 .

[48]  Wray L. Buntine Operations for Learning with Graphical Models , 1994, J. Artif. Intell. Res..

[49]  George J. Klir,et al.  On the uniqueness of possibilistic measure of uncertainty and information , 1987 .

[50]  G. L. S. Shackle,et al.  Decision Order and Time in Human Affairs , 1962 .

[51]  David Lindley,et al.  The Probability Approach to the Treatment of Uncertainty in Artificial Intelligence and Expert Systems , 1987 .

[52]  Kristian G. Olesen,et al.  HUGIN - A Shell for Building Bayesian Belief Universes for Expert Systems , 1989, IJCAI.

[53]  R. T. Cox Probability, frequency and reasonable expectation , 1990 .

[54]  E. Shortliffe Computer-based medical consultations: MYCIN (Elsevier North Holland) , 1976 .

[55]  L. J. Savage,et al.  The Foundations of Statistics , 1955 .

[56]  Pascale Fonck Conditional Independence in Possibility Theory , 1994, UAI.

[57]  P. M. Williams Bayesian Conditionalisation and the Principle of Minimum Information , 1980, The British Journal for the Philosophy of Science.

[58]  Frank Klawonn,et al.  Foundations of fuzzy systems , 1994 .

[59]  R. Hartley Transmission of information , 1928 .

[60]  Ronald L. Iman,et al.  Rejoinder to comments , 1980 .

[61]  George J. Klir,et al.  Fuzzy sets, uncertainty and information , 1988 .

[62]  Prakash P. Shenoy,et al.  Axioms for probability and belief-function propagation , 1990, UAI.

[63]  P. Spirtes,et al.  An Algorithm for Fast Recovery of Sparse Causal Graphs , 1991 .

[64]  Khaled Mellouli,et al.  Propagating belief functions in qualitative Markov trees , 1987, Int. J. Approx. Reason..

[65]  T. Speed,et al.  Decomposable graphs and hypergraphs , 1984, Journal of the Australian Mathematical Society. Series A. Pure Mathematics and Statistics.

[66]  R. A. Leibler,et al.  On Information and Sufficiency , 1951 .

[67]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[68]  Milan Studený Formal Properties of Conditional Independence in Different Calculi of AI , 1993, ECSQARU.

[69]  E. Ziegel,et al.  Artificial intelligence and statistics , 1986 .

[70]  Daniel Hunter,et al.  Graphoids and natural conditional functions , 1991, Int. J. Approx. Reason..

[71]  David Heckerman,et al.  Probabilistic Interpretation for MYCIN's Certainty Factors , 1990, UAI.

[72]  G. de Cooman,et al.  A new approach to possibilistic independence , 1994, Proceedings of 1994 IEEE 3rd International Fuzzy Systems Conference.

[73]  Rudolf Kruse,et al.  Uncertainty and vagueness in knowledge based systems: numerical methods , 1991, Artificial intelligence.

[74]  Philippe Smets,et al.  The Transferable Belief Model , 1994, Artif. Intell..

[75]  D. Dubois,et al.  Fuzzy sets in approximate reasoning, part 1: inference with possibility distributions , 1999 .

[76]  David Lindley Scoring rules and the inevitability of probability , 1982 .

[77]  Rudolf Kruse,et al.  Uncertainty and Vagueness in Knowledge Based Systems , 1991, Artificial Intelligence.

[78]  Moninder Singh,et al.  An Algorithm for the Construction of Bayesian Network Structures from Data , 1993, UAI.

[79]  Rudolf Kruse,et al.  A New Approach to Semantic Aspects of Possibilistic Reasoning , 1993, ECSQARU.

[80]  H. Heyer,et al.  Information and Sufficiency , 1982 .

[81]  Rudolf Kruse,et al.  Background and Perspectives of Possibilistic Graphical Models , 1997, ECSQARU-FAPR.

[82]  David J. Spiegelhalter,et al.  Local computations with probabilities on graphical structures and their application to expert systems , 1990 .

[83]  Prakash P. Shenoy,et al.  Valuation Networks and Conditional Independence , 1993, Conference on Uncertainty in Artificial Intelligence.

[84]  E. Hisdal Conditional possibilities independence and noninteraction , 1978 .

[85]  M. Gupta,et al.  FUZZY INFORMATION AND DECISION PROCESSES , 1981 .

[86]  Glenn Shafer,et al.  Readings in Uncertain Reasoning , 1990 .

[87]  A. Dempster Upper and Lower Probabilities Generated by a Random Closed Interval , 1968 .

[88]  V. Strassen,et al.  Meßfehler und Information , 1964 .

[89]  Didier Dubois,et al.  A semantics for possibility theory based on likelihoods , 1995, Proceedings of 1995 IEEE International Conference on Fuzzy Systems..

[90]  Judea Pearl,et al.  When can association graphs admit a causal interpretation , 1994 .

[91]  Rudolf Kruse,et al.  Learning Possibilistic Networks from Data , 1995, AISTATS.

[92]  C. N. Liu,et al.  Approximating discrete probability distributions with dependence trees , 1968, IEEE Trans. Inf. Theory.

[93]  Hans-J. Lenz,et al.  Learning from Data - Fifth International Workshop on Artificial Intelligence and Statistics, AISTATS 1995, Key West, Florida, USA, January, 1995. Proceedings , 1996, AISTATS.

[94]  P. Cheeseman Probabilistic versus Fuzzy Reasoning , 1986 .

[95]  Didier Dubois,et al.  The logical view of conditioning and its application to possibility and evidence theories , 1990, Int. J. Approx. Reason..

[96]  Robert E. Tarjan,et al.  Simple Linear-Time Algorithms to Test Chordality of Graphs, Test Acyclicity of Hypergraphs, and Selectively Reduce Acyclic Hypergraphs , 1984, SIAM J. Comput..

[97]  Dov M. Gabbay,et al.  Handbook of defeasible reasoning and uncertainty management systems: volume 2: reasoning with actual and potential contradictions , 1998 .

[98]  J. Q. Smith,et al.  Bayesian Statistics 4 , 1993 .

[99]  J. M. Hammersley,et al.  Markov fields on finite graphs and lattices , 1971 .

[100]  Ren C. Luo,et al.  Multisensor integration and fusion for intelligent machines and systems , 1995 .

[101]  J. Besag,et al.  Bayesian image restoration, with two applications in spatial statistics , 1991 .

[102]  Luis M. de Campos,et al.  Updating Uncertain Information , 1990, IPMU.

[103]  David Heckerman,et al.  A Characterization of the Dirichlet Distribution with Application to Learning Bayesian Networks , 1995, UAI.

[104]  P. Walley Statistical Reasoning with Imprecise Probabilities , 1990 .

[105]  Didier Dubois,et al.  Modelling uncertainty and inductive inference: A survey of recent non-additive probability systems , 1988 .

[106]  Edward H. Shortliffe,et al.  A model of inexact reasoning in medicine , 1990 .

[107]  Charles Elkan,et al.  The paradoxical success of fuzzy logic , 1993, IEEE Expert.

[108]  G. Klir,et al.  MEASURES OF UNCERTAINTY AND INFORMATION BASED ON POSSIBILITY DISTRIBUTIONS , 1982 .

[109]  Didier Dubois,et al.  Expressing Independence in a Possibilistic Framework and its Application to Default Reasoning , 1994, ECAI.

[110]  Wolfgang Spohn,et al.  A general non-probabilistic theory of inductive reasoning , 2013, UAI.

[111]  Rudolf Kruse,et al.  Parallel Combination of Information Sources , 1998 .

[112]  Andrew P. Sage,et al.  Uncertainty in Artificial Intelligence , 1987, IEEE Transactions on Systems, Man, and Cybernetics.

[113]  Wang Pei-zhuang From the Fuzzy Statistics to the Falling Random Subsets , 1983 .

[114]  Michael Clarke,et al.  Symbolic and Quantitative Approaches to Reasoning and Uncertainty , 1991, Lecture Notes in Computer Science.

[115]  Wolfgang Spohn,et al.  Ordinal Conditional Functions: A Dynamic Theory of Epistemic States , 1988 .

[116]  R. Fildes Journal of the Royal Statistical Society (B): Gary K. Grunwald, Adrian E. Raftery and Peter Guttorp, 1993, “Time series of continuous proportions”, 55, 103–116. , 1993 .

[117]  Glenn Shafer,et al.  A Mathematical Theory of Evidence , 2020, A Mathematical Theory of Evidence.

[118]  V. Isham An Introduction to Spatial Point Processes and Markov Random Fields , 1981 .

[119]  Enrique F. Castillo,et al.  Expert Systems and Probabilistic Network Models , 1996, Monographs in Computer Science.

[120]  Judea Pearl,et al.  Reasoning with belief functions: An analysis of compatibility , 1990, Int. J. Approx. Reason..

[121]  J. Q. Smith Influence Diagrams for Statistical Modelling , 1989 .

[122]  Edward H. Shortliffe,et al.  Computer-based medical consultations, MYCIN , 1976 .