Information Width

Kolmogorov argued that the concept of information also exists in problems with no underlying stochastic model (unlike Shannon's representation of information), for instance the information contained in an algorithm or in the genome. He introduced a combinatorial notion of entropy and of the information I(x : y) conveyed by a binary string x about the unknown value of a variable y. The current paper poses the following questions: what is the relationship between the information conveyed by x about y and the description complexity of x? Is there a notion of the cost of information? Are there limits on how efficiently x can convey information? To answer these questions, Kolmogorov's definition is extended and a new concept, termed information width, is introduced; it is similar to the notion of n-widths in approximation theory. The information of any input source, e.g., sample-based, general side information, or a hybrid of both, can be evaluated by a single common formula. An application to the space of binary functions is considered.
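
For orientation, a minimal sketch of Kolmogorov's combinatorial definition referred to above (not of this paper's extension): if the unknown value of y is known only to lie in a finite set A, its combinatorial entropy is H(y) = log_2 |A|, and a string x that restricts the possibilities for y to a subset A_x of A conveys

\[ I(x : y) \;=\; H(y) - H(y \mid x) \;=\; \log_2 |A| - \log_2 |A_x| . \]

Here the set A and the restricted set A_x are placeholders used only to illustrate the definition; the paper's information width generalizes this quantity to arbitrary input sources.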
