New Data Structure for Computational Molecular Design with Atomic or Fragment Resolution

A new molecular data structure and accompanying structure-manipulation algorithms are proposed for general-purpose molecular design. The data structure supports a variety of operations for creating new molecules, which fall into two types: uni-molecular and bi-molecular. In a uni-molecular operation, a child molecule is created from a single parent by, for example, adding a functional group, deleting a fragment, or mutating an atom. In a bi-molecular operation, child molecules are generated from two parent molecules through combination or crossover (hybridization). These operations are essential for creating and modifying molecules in molecular design. The data structure can represent linear, branched, multifunctional, and multivalent compounds. Algorithms are also developed for deriving the data structure of a molecule from its atomic coordinates and vice versa. We show that the data structure and algorithms, together referred to as MARS (Molecular Assembling and Representation Suite), allow one to generate a comprehensive library of new molecules by performing every possible molecular structure modification.
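
To make the distinction between uni-molecular and bi-molecular operations concrete, the following is a minimal Python sketch. It is not the MARS data structure, which is not specified in this abstract; it assumes a plain molecular graph of heavy atoms with implicit hydrogens and no valence checking. The names `Molecule`, `mutate`, and `combine` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Molecule:
    """Toy molecular graph (an assumption, not the MARS representation).

    atoms: element symbols of the heavy atoms, indexed by position.
    bonds: undirected bonds, each stored as a frozenset of two atom indices.
    """
    atoms: list = field(default_factory=list)
    bonds: set = field(default_factory=set)

    def copy(self):
        # Children are built on copies so that parents are never modified.
        return Molecule(list(self.atoms), set(self.bonds))


def mutate(parent, index, new_element):
    """Uni-molecular operation: replace the element of one atom."""
    child = parent.copy()
    child.atoms[index] = new_element
    return child


def combine(parent_a, site_a, parent_b, site_b):
    """Bi-molecular operation: join two parents by a new bond.

    Atoms of parent_b are appended after those of parent_a, so their
    indices are shifted by the number of atoms in parent_a.
    """
    child = parent_a.copy()
    offset = len(child.atoms)
    child.atoms.extend(parent_b.atoms)
    for bond in parent_b.bonds:
        child.bonds.add(frozenset(i + offset for i in bond))
    child.bonds.add(frozenset((site_a, site_b + offset)))
    return child


# Heavy-atom skeletons only; hydrogens are implicit in this sketch.
ethane = Molecule(["C", "C"], {frozenset((0, 1))})
hydroxyl = Molecule(["O"])

methanol = mutate(ethane, 1, "O")          # C-C  ->  C-O
ethanol = combine(ethane, 1, hydroxyl, 0)  # C-C + O  ->  C-C-O
```

In this sketch, adding a functional group is simply `combine` applied to a parent and a small fragment. Deleting a fragment or performing crossover would additionally require cutting a bond and reindexing the remaining atoms, which is omitted here for brevity.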