Quantifying Total Influence between Variables with Information Theoretic and Machine Learning Techniques

The increasingly sophisticated investigation of complex systems requires more robust estimates of the correlations between the measured quantities. The traditional Pearson correlation coefficient is easy to calculate but is sensitive only to linear correlations. The total influence between quantities is therefore often expressed in terms of the Mutual Information, which also accounts for nonlinear effects but is not normalised. To compare data from different experiments, the Information Quality Ratio, the Mutual Information normalised by the joint entropy, is therefore in many cases easier to interpret. On the other hand, both the Mutual Information and the Information Quality Ratio are always non-negative and therefore cannot convey the sign of the influence between quantities. Moreover, they require an accurate determination of the probability distribution functions of the variables involved, and the quality and amount of the available data are not always sufficient to guarantee such an estimate. It has therefore been investigated whether neural computational tools can help and complement the aforementioned indicators. Specific encoders and autoencoders have been developed for the task of determining the total correlation between quantities, including the sign of their mutual influence. Both their accuracy and computational efficiency have been assessed in detail with extensive numerical tests on synthetic data. The first applications to experimental databases are very encouraging.
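
To make the distinction between the indicators concrete, the following minimal sketch (not the implementation developed in this work; the bin count, sample size, and quadratic test relation are illustrative assumptions) contrasts the Pearson coefficient with histogram-based estimates of the Mutual Information I(X;Y) and of the Information Quality Ratio IQR = I(X;Y) / H(X,Y), where H(X,Y) is the joint entropy:

```python
import numpy as np

rng = np.random.default_rng(0)

def mi_and_iqr(x, y, bins=32):
    """Histogram estimates of I(X;Y) and IQR = I(X;Y) / H(X,Y)."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()                      # joint PDF estimate
    px = pxy.sum(axis=1, keepdims=True)        # marginal of X
    py = pxy.sum(axis=0, keepdims=True)        # marginal of Y
    nz = pxy > 0                               # skip empty bins: 0*log(0) = 0
    mi = np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz]))
    h_xy = -np.sum(pxy[nz] * np.log(pxy[nz]))  # joint entropy H(X,Y)
    return mi, mi / h_xy

# Purely nonlinear (quadratic) dependence: the Pearson coefficient is ~0,
# while the Mutual Information and the IQR remain clearly positive.
x = rng.normal(size=20_000)
y = x**2 + 0.1 * rng.normal(size=20_000)

print(f"Pearson: {np.corrcoef(x, y)[0, 1]:+.3f}")
mi, iqr = mi_and_iqr(x, y)
print(f"MI: {mi:.3f} nats, IQR: {iqr:.3f}")
```

For this quadratic relation the Pearson coefficient essentially vanishes while the Mutual Information and the Information Quality Ratio stay well above zero, illustrating why the latter are preferred for total influence; note also that both estimates depend on the binning choice, which is the sensitivity to the probability distribution functions mentioned above.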