Generalized Single Class Discrimination (GSCD). A New Method for the Analysis of Embedded Structure‐Activity Relationships

Generalized Single Class Discrimination using Principal Component Analysis (GSCD-PCA) is a novel method for the analysis of embedded biological activity. It applies to continuous activity measures and is suitable for multivariate data sets. It was developed as a logical extension of Single Class Discrimination, which we recently described for the analysis of classified embedded biological activity. Four different GSCD-PCA algorithms are compared on artificial data sets containing parabolic and linear property-activity relationships. Two examples on structure-activity data sets are given. The method performed well and produced stable, interpretable models.
