Information and Organic Molecules: Structure Considerations via Integer Statistics

Information in relation to organic molecules was investigated in a previous work (Graham and Schacht, J. Chem. Inf. Comput. Sci. 2000, 40, 187). Here the topic is considered further with the help of integer statistics. We discuss the ramifications of an integer variable Omega(t), which quantifies the total number of binding complexions for an organic molecule, and offer a statistical view of the maximum allowed number of independent regions D expressed by the molecule, which depends on Omega(t). We illustrate the distribution properties of D along with upper-limit estimates of the regioinformation mu, which also depends on Omega(t). Integer statistics based on elementary number theory establish the key distribution properties of D and mu. In so doing, we enumerate the traits that distinguish high-regioinformation molecules. The statistical approach encompasses all possible molecules and conditions, not just those reported to date in chemical databases. The aim is to view the regioinformation expressed by molecules in an alternative and general way.
