MONA – Interactive manipulation of molecule collections

Working with small-molecule datasets is a routine task for cheminformaticians and chemists. The analysis and comparison of vendor catalogues and the compilation of promising candidates as starting points for screening campaigns are just a few common applications. The workflows applied for this purpose usually consist of multiple basic cheminformatics tasks, such as checking for duplicates or filtering by physico-chemical properties. Pipelining tools make it easy to create and change such workflows, but they usually do not support interventions once the pipeline has been started. In many contexts, however, the best-suited workflow is not known in advance, so the results of previous steps have to be taken into consideration before proceeding.

To support intuition-driven processing of compound collections, we developed MONA, an interactive tool designed to prepare and visualize large small-molecule datasets. Built on an SQL database, it lets users perform common cheminformatics tasks such as analysis and filtering interactively, with various methods for visual support. Great care was taken in creating a simple, intuitive user interface that can be used instantly without any setup steps. MONA combines the interactivity of molecule database systems with the simplicity of pipelining tools, thus enabling the case-by-case application of chemistry expert knowledge. The current version is available free of charge for academic use and can be downloaded at http://www.zbh.uni-hamburg.de/mona.
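To illustrate how duplicate checking and property filtering can be expressed against an SQL database, the following is a minimal sketch using Python's built-in sqlite3 module. It is not MONA's implementation or schema: the table `molecules`, its columns, and the helper functions are hypothetical, the rows are small illustrative entries, and duplicates are found by exact string match, which assumes the SMILES strings have already been canonicalized.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical molecule table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE molecules ("
        " id INTEGER PRIMARY KEY,"
        " source TEXT,"       # which collection the entry came from
        " smiles TEXT,"       # assumed to be canonical SMILES
        " mol_weight REAL)"   # molecular weight in g/mol
    )
    # Illustrative rows only: ethanol, benzene, ethanol again, aspirin.
    rows = [
        ("vendor_a", "CCO", 46.07),
        ("vendor_a", "c1ccccc1", 78.11),
        ("vendor_b", "CCO", 46.07),
        ("vendor_b", "CC(=O)Oc1ccccc1C(=O)O", 180.16),
    ]
    conn.executemany(
        "INSERT INTO molecules (source, smiles, mol_weight) VALUES (?, ?, ?)",
        rows,
    )
    return conn


def find_duplicates(conn):
    """Return (smiles, count) for every structure stored more than once."""
    cur = conn.execute(
        "SELECT smiles, COUNT(*) FROM molecules"
        " GROUP BY smiles HAVING COUNT(*) > 1 ORDER BY smiles"
    )
    return cur.fetchall()


def filter_by_weight(conn, min_weight, max_weight):
    """Return distinct structures whose weight lies in the given range."""
    cur = conn.execute(
        "SELECT DISTINCT smiles FROM molecules"
        " WHERE mol_weight BETWEEN ? AND ? ORDER BY smiles",
        (min_weight, max_weight),
    )
    return [row[0] for row in cur.fetchall()]


if __name__ == "__main__":
    conn = build_demo_db()
    print(find_duplicates(conn))
    print(filter_by_weight(conn, 50.0, 200.0))
```

Because each step is a separate query against the same database, the result of one step can be inspected before the next is chosen, which is the kind of interactive use the abstract describes.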
