Toxicity modeling and prediction with pattern recognition.

Empirical models can be constructed that relate changes in toxicity to changes in chemical structure for series of similar compounds or mixtures. The first step is to translate the variation in structure into quantitative descriptors. This gives a data table, a data matrix denoted by X, which is then analyzed. The same type of model can be used to relate the variation in in vivo data to the variation in a battery of in vitro tests. A single data-analytical model cannot be applied to a set of compounds of diverse chemical structure. For such data sets, separate models must be developed for each subgroup of compounds. The data-analytical problem is then partly one of classification, that is, pattern recognition (PARC). The assumption of structural and biological similarity within each subset of modeled compounds is essential for empirical models to apply. PARC is often used to classify compounds as active (toxic) or inactive. The data structure is then often asymmetric, which puts special demands on the data analysis and makes the traditional PARC methods inapplicable. Depending on the information desired from the data analysis and on the type of data available, four levels of PARC can be distinguished: (I) the data X are used to develop rules for classifying future compounds into one of the classes represented in X; (II) same as I, but the possibility that future compounds belong to "unknown" classes not represented in X is taken into account; (III) same as II, plus the quantitative prediction of one activity variable (here toxicity) in some classes; (IV) same as III, but several quantitative activity (toxicity) variables are predicted.
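Levels I and II can be illustrated with a small sketch. The abstract does not specify an algorithm, so what follows is an assumed stand-in: each class in X gets its own principal components model, and a new compound is assigned to the class whose model it fits best, or to "unknown" when it fits none of them. The function names, the synthetic descriptor data, and the `ratio_limit` threshold are hypothetical choices made for illustration, not values taken from the paper.

```python
import numpy as np

def fit_class_model(X, n_components=1):
    """Fit a separate principal components model to one class (rows = compounds)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions of the centered class data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T  # loadings, shape (n_descriptors, n_components)
    resid = Xc - Xc @ P @ P.T
    n, k = X.shape
    dof = max((n - n_components - 1) * (k - n_components), 1)
    s0 = np.sqrt((resid ** 2).sum() / dof)  # typical residual SD within the class
    return {"mean": mean, "P": P, "s0": s0}

def residual_sd(model, x):
    """Residual SD of one compound after projection onto a class model."""
    xc = x - model["mean"]
    e = xc - model["P"] @ (model["P"].T @ xc)
    dof = max(x.size - model["P"].shape[1], 1)
    return np.sqrt((e ** 2).sum() / dof)

def classify(models, x, ratio_limit=2.0):
    """Level II rule: best-fitting class, or 'unknown' if no class fits.

    ratio_limit is an assumed, illustrative cutoff on the ratio between the
    compound's residual SD and the class's own residual SD.
    """
    ratios = {label: residual_sd(m, x) / m["s0"] for label, m in models.items()}
    best = min(ratios, key=ratios.get)
    return best if ratios[best] <= ratio_limit else "unknown"

# Synthetic descriptor data (hypothetical): two classes, four descriptors each.
rng = np.random.default_rng(0)

def make_class(center, direction, n=20, noise=0.1):
    direction = np.asarray(direction, dtype=float)
    direction = direction / np.linalg.norm(direction)
    scores = rng.normal(scale=2.0, size=(n, 1))
    jitter = rng.normal(scale=noise, size=(n, direction.size))
    return np.asarray(center, dtype=float) + scores * direction + jitter

models = {
    "A": fit_class_model(make_class([0, 0, 0, 0], [1, 1, 0, 0])),
    "B": fit_class_model(make_class([5, 5, 5, 5], [0, 0, 1, -1])),
}

print(classify(models, np.array([0.5, 0.5, 0.0, 0.0])))    # lies along class A's trend
print(classify(models, np.array([5.0, 5.0, 6.0, 4.0])))    # lies along class B's trend
print(classify(models, np.array([10.0, -10.0, 3.0, 7.0])))  # far from both classes
```

Dropping the `ratio_limit` check and always returning the best-fitting class gives a level I rule. Levels III and IV would additionally need a regression of one or several toxicity variables on the descriptors within each class, which this sketch does not attempt.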
