Probability Distribution on Full Rooted Trees

The recursive and hierarchical structure of full rooted trees makes them suitable for representing statistical models in various areas, such as data compression, image processing, and machine learning. In most of these cases, however, the full rooted tree is not treated as a random variable, which makes model selection to avoid overfitting problematic. One way to solve this problem is to assume a prior distribution on the full rooted trees, which enables optimal model selection based on Bayes decision theory. For example, by assigning a low prior probability to an overly complex model, the maximum a posteriori estimator avoids selecting it. Furthermore, we can average over all models, weighting each by its posterior probability. In this paper, we propose a probability distribution on a set of full rooted trees. Its parametric representation is suitable for calculating properties of the distribution, such as the mode, the expectation, and the posterior distribution, by recursive functions. Although similar distributions have been proposed in previous studies, they are tailored to specific applications. We therefore extract their mathematically essential components and derive generalized methods for calculating the expectation, the posterior distribution, and other quantities.
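To make the recursive computations mentioned above concrete, the following Python sketch illustrates the general idea. It is not the distribution proposed in the paper; it assumes a hypothetical parametrization in which every node of a depth-limited full K-ary tree splits into K children with probability g and becomes a leaf with probability 1 - g (nodes at the depth limit are forced to be leaves). Under that assumption, both the probability of a particular tree and the expected number of leaves can be computed by simple recursions, which is the kind of calculation the parametric representation is intended to support.

```python
# Minimal sketch of a probability distribution on full rooted trees.
# Assumed (hypothetical) parametrization, not the paper's: each node of a
# depth-limited full K-ary tree is internal with probability g and a leaf
# with probability 1 - g; nodes at the depth limit are leaves with probability 1.

from functools import lru_cache


def tree_probability(tree, g, depth=0, max_depth=3):
    """Probability of one full rooted tree under the assumed prior.

    `tree` is a nested structure: a leaf is None, an internal node is a
    list of its K subtrees.  Each internal node contributes a factor g,
    each leaf a factor (1 - g), except forced leaves at the depth limit.
    """
    if tree is None:
        # leaf node: factor (1 - g), or 1 if it sits at the depth limit
        return 1.0 if depth == max_depth else 1.0 - g
    p = g  # this node split into its children
    for child in tree:
        p *= tree_probability(child, g, depth + 1, max_depth)
    return p


@lru_cache(maxsize=None)
def expected_leaves(depth, max_depth, k, g):
    """Expected number of leaves below a node at `depth`, via recursion over depth."""
    if depth == max_depth:  # forced leaf at the depth limit
        return 1.0
    # with probability (1 - g) the node is a leaf (1 leaf);
    # with probability g it splits into k independent subtrees
    return (1.0 - g) + g * k * expected_leaves(depth + 1, max_depth, k, g)


if __name__ == "__main__":
    # A full binary tree: the root splits and both children are leaves.
    t = [None, None]
    print(tree_probability(t, g=0.5))             # 0.5 * 0.5 * 0.5 = 0.125
    print(expected_leaves(0, max_depth=3, k=2, g=0.5))  # expected leaf count
```

In this sketch the recursion over depth plays the role of the recursive functions mentioned in the abstract; analogous recursions could, under the same assumptions, propagate posterior updates of g from the leaves toward the root.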
