Automatic identification of hierarchy in multivariate data

Given n variables to model, symbolic regression (SR) returns a flat list of n equations. As the number of state variables grows, such a list becomes increasingly difficult to interpret. Here we present a symbolic regression method that detects and captures hidden hierarchy in a given system. The method returns the equations as a hierarchical dependency graph, which makes the results easier to interpret. We demonstrate two variations of this hierarchical modeling approach and show that both consistently outperform non-hierarchical symbolic regression on a number of synthetic data sets.
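
To illustrate what a hierarchical dependency graph of equations adds over a flat list, the following minimal sketch arranges a set of equations by dependency depth. It is not the method presented here: the variable names, the expressions, and the `dependency_levels` helper are all hypothetical, and the dependencies are supplied by hand rather than detected from data.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical SR output: each modeled variable maps to
# (expression, list of variables the expression depends on).
# A flat list would present these three equations with no ordering.
equations = {
    "u": ("x1 * x2", ["x1", "x2"]),
    "v": ("sin(u) + x3", ["u", "x3"]),
    "w": ("u / v", ["u", "v"]),
}

def dependency_levels(eqs):
    """Assign each variable a depth: 0 for raw inputs, otherwise
    one more than the deepest variable it depends on."""
    graph = {var: set(deps) for var, (_, deps) in eqs.items()}
    levels = {}
    # static_order() yields every variable after all of its dependencies.
    for var in TopologicalSorter(graph).static_order():
        deps = graph.get(var, ())
        levels[var] = 1 + max((levels[d] for d in deps), default=-1)
    return levels

levels = dependency_levels(equations)

# Print the equations grouped by depth, shallowest first.
for var in sorted(equations, key=lambda name: levels[name]):
    expr, _ = equations[var]
    print(f"level {levels[var]}: {var} = {expr}")
```

Grouping by depth shows at a glance that `u` is built from raw inputs, `v` builds on `u`, and `w` builds on both, which is the kind of structure a flat list of equations leaves implicit.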
