EFFICIENT METHODS FOR CALCULATING MAXIMUM ENTROPY DISTRIBUTIONS

We present a new algorithm for computing the maximum entropy probability distribution satisfying a set of constraints. Unlike previous approaches, our method is integrated with the planning of data collection and tabulation. We show how adding constraints and performing the associated additional tabulations can substantially speed up computation by replacing the usual iterative techniques with a straightforward computation. We note, however, that the added constraints may involve significantly more variables than any of the original constraints, so there may not be enough data to collect meaningful statistics for them. These extra constraints are shown to correspond to the intermediate tables in Cheeseman's method. Furthermore, we prove that acyclic hypergraphs and decomposable models are equivalent, and discuss the similarities and differences between our algorithm and Spiegelhalter's algorithm. Finally, we compare our work to Kim and Pearl's work on singly-connected networks. Portions of this thesis are joint work with Ronald Rivest.
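
As a small illustration of why a non-iterative computation can be possible, consider the standard textbook case of three variables A, B, C with tabulated marginals over {A, B} and {B, C}. These two constraints form an acyclic hypergraph, and the maximum entropy joint distribution is then known in closed form: p(a, b, c) = p(a, b) p(b, c) / p(b). The sketch below is not the algorithm of this thesis; the function names and the numeric tables are hypothetical and chosen only for illustration.

```python
def marginal(joint, keep):
    """Sum a joint table {(a, b, c): p} down to the variable positions in `keep`."""
    out = {}
    for assignment, p in joint.items():
        key = tuple(assignment[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


def closed_form_maxent(p_ab, p_bc):
    """Maximum entropy joint over (A, B, C) given consistent AB and BC marginals.

    Uses p(a, b, c) = p(a, b) * p(b, c) / p(b); no iteration is needed.
    """
    # Marginal of the shared variable B, obtained from the AB table.
    p_b = {}
    for (a, b), p in p_ab.items():
        p_b[b] = p_b.get(b, 0.0) + p

    joint = {}
    for (a, b), pab in p_ab.items():
        for (b2, c), pbc in p_bc.items():
            if b2 == b and p_b[b] > 0:
                joint[(a, b, c)] = pab * pbc / p_b[b]
    return joint


# Hypothetical tabulated marginals (they agree on p(B): 0.4 and 0.6).
p_ab = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}
p_bc = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.5, (1, 1): 0.1}

joint = closed_form_maxent(p_ab, p_bc)
```

The resulting joint reproduces both input marginals exactly, which can be checked with `marginal(joint, (0, 1))` and `marginal(joint, (1, 2))`. When the constraint sets form a cycle (for example {A, B}, {B, C}, {A, C}), no such product formula applies in general and an iterative procedure is normally used instead.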