kNN estimation of the unilateral dependency measure between random variables

The informational energy (IE) can be interpreted as a measure of average certainty. In previous work, we introduced a non-parametric, asymptotically unbiased, and consistent estimator of the IE. Our method is based on the k-th nearest neighbor (kNN) technique and applies to both continuous and discrete spaces, so it can be used in both classification and regression algorithms. Building on the IE, we introduced a unilateral dependency measure between random variables. In the present paper, we show how to estimate this unilateral dependency measure from an available sample set of discrete or continuous variables, using the kNN and naïve histogram estimators. We compare the two estimators experimentally. Then, in a real-world application, we apply the kNN and histogram estimators to approximate the unilateral dependency between random variables describing the temperatures recorded by sensors placed in a refrigerating room.
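To make the two estimation strategies concrete, the sketch below contrasts them in one dimension, where IE(X) = E[f(X)] for a density f. The histogram estimator integrates the squared empirical density over the bins, while the kNN estimator plugs a k-th-nearest-neighbor density estimate into the sample mean. This is a minimal illustrative sketch, not the paper's exact formulation: the (k-1)/(n-1) normalization mirrors the asymptotically unbiased form cited in the abstract, but the function names, parameters, and the fixed [0, 1] support are assumptions made for the example.

```python
import math
import random

def ie_histogram(sample, bins=10, lo=0.0, hi=1.0):
    """Naive histogram estimate of IE(X) = E[f(X)] on [lo, hi]:
    with bin probabilities p_i and bin width w, the plug-in estimate
    is sum_i p_i * (p_i / w) = (1/w) * sum_i p_i**2."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in sample:
        # clamp so x == hi falls in the last bin
        i = min(int((x - lo) / width), bins - 1)
        counts[i] += 1
    n = len(sample)
    return sum((c / n) ** 2 for c in counts) / width

def ie_knn(sample, k=5):
    """kNN plug-in estimate of IE in 1-D. For each point x_i,
    f_hat(x_i) = (k-1) / ((n-1) * 2 * R_i), where R_i is the distance
    to the k-th nearest neighbor (a 1-D ball of radius R has volume 2R);
    IE is estimated as the sample mean of f_hat."""
    n = len(sample)
    total = 0.0
    for i, x in enumerate(sample):
        # brute-force neighbor search, O(n^2 log n) overall; fine for a sketch
        dists = sorted(abs(x - y) for j, y in enumerate(sample) if j != i)
        r_k = dists[k - 1]
        total += (k - 1) / ((n - 1) * 2.0 * r_k)
    return total / n
```

For a Uniform(0, 1) sample the true IE is the integral of f squared, which equals 1, so with a moderately large sample both estimators should return values close to 1; the kNN estimator avoids the bin-placement sensitivity of the histogram at the cost of a neighbor search.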