lazar: a modular predictive toxicology framework

lazar (lazy structure–activity relationships) is a modular framework for predictive toxicology. Similar to the read-across procedure in toxicological risk assessment, lazar creates a local QSAR (quantitative structure–activity relationship) model for each compound to be predicted. Model developers can choose from a wide range of algorithms for descriptor calculation and selection, chemical similarity indices, and model building. This paper presents a high-level description of the lazar framework and discusses the performance of example classification and regression models.
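
The idea of building a local model per query compound can be illustrated with a minimal sketch. It is not lazar's implementation: it assumes that compounds are already represented as sets of descriptor identifiers, uses the Tanimoto index as the similarity measure, and uses a similarity-weighted vote over neighbors as the local model. The names `tanimoto`, `predict_local`, `min_sim`, and the toy training data are hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, List, Optional, Tuple

# Hypothetical training example: (descriptor identifiers, activity label).
Example = Tuple[FrozenSet[str], bool]


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto (Jaccard) similarity between two descriptor sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict_local(
    query: FrozenSet[str],
    training: List[Example],
    min_sim: float = 0.3,
) -> Optional[Tuple[bool, float]]:
    """Build a local model for one query compound at prediction time.

    Neighbors are training compounds with similarity >= min_sim (an assumed
    threshold). The local model is a similarity-weighted vote; the function
    returns (predicted label, weighted fraction of active neighbors), or
    None when the query has no sufficiently similar neighbors.
    """
    neighbors = [(tanimoto(query, feats), label) for feats, label in training]
    neighbors = [(sim, label) for sim, label in neighbors if sim >= min_sim]
    if not neighbors:
        return None  # no neighbors above the threshold: make no prediction
    total = sum(sim for sim, _ in neighbors)
    active = sum(sim for sim, label in neighbors if label)
    p_active = active / total
    return p_active >= 0.5, p_active


# Toy data, invented for illustration only.
training: List[Example] = [
    (frozenset({"nitro", "aromatic_ring"}), True),
    (frozenset({"nitro", "aromatic_ring", "amine"}), True),
    (frozenset({"hydroxyl", "alkyl_chain"}), False),
    (frozenset({"aromatic_ring", "hydroxyl"}), False),
]
query = frozenset({"nitro", "aromatic_ring", "halogen"})
result = predict_local(query, training)
```

In this sketch, swapping the descriptor sets, the similarity function, or the voting step for another choice changes one function at a time, which mirrors the modular design described above. Returning `None` when no neighbor passes the threshold is a design choice of the sketch, not a documented lazar behavior.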
