Accurate Specification of Molecular Structures: The Case for Zero-Order Bonds and Explicit Hydrogen Counting

Most data structures used to represent molecular entities in cheminformatics are underspecified when it comes to nonorganic chemical species. Two extensions are proposed: allowing bond orders of 0, and adding an atom property that controls the number of inferred attached hydrogen atoms. The case for these two extensions is made by demonstrating effective representations of several unconventional bonding types that data structures in common use cannot adequately represent. A set of enhancements to the industry-standard MDL CTfile format is also proposed, including a backward-compatibility mechanism that maximizes interpretability by software that has not been updated to use the extensions.
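As a rough illustration of the two proposed extensions, the following Python sketch models a molecular graph in which a bond may have order 0 and each atom may carry an optional hydrogen-count override. The names `Molecule`, `Atom`, `Bond`, `explicit_h`, and `hydrogen_count`, as well as the small valence table, are assumptions made for this example; they are not taken from the CTfile proposal, and the inference rule is deliberately simplified (charges are ignored).

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical default valences for a few elements (illustration only).
# Elements missing from this table receive no inferred hydrogens.
DEFAULT_VALENCE = {"B": 3, "C": 4, "N": 3, "O": 2}


@dataclass
class Atom:
    element: str
    # None means "infer hydrogens from valence"; an integer overrides inference.
    explicit_h: Optional[int] = None


@dataclass
class Bond:
    a: int      # index of the first atom
    b: int      # index of the second atom
    order: int  # 0, 1, 2 or 3; order 0 records connectivity without using valence


@dataclass
class Molecule:
    atoms: List[Atom] = field(default_factory=list)
    bonds: List[Bond] = field(default_factory=list)

    def add_atom(self, element: str, explicit_h: Optional[int] = None) -> int:
        self.atoms.append(Atom(element, explicit_h))
        return len(self.atoms) - 1

    def add_bond(self, a: int, b: int, order: int) -> None:
        if order not in (0, 1, 2, 3):
            raise ValueError("bond order must be 0, 1, 2 or 3")
        self.bonds.append(Bond(a, b, order))

    def hydrogen_count(self, idx: int) -> int:
        atom = self.atoms[idx]
        # The explicit override always wins over inference.
        if atom.explicit_h is not None:
            return atom.explicit_h
        valence = DEFAULT_VALENCE.get(atom.element)
        if valence is None:
            return 0
        # Zero-order bonds add 0 here, so they never reduce the hydrogen count.
        used = sum(bond.order for bond in self.bonds if idx in (bond.a, bond.b))
        return max(0, valence - used)


# Ammonia borane drawn with a zero-order N-B bond and neutral atoms.
mol = Molecule()
n = mol.add_atom("N")
b = mol.add_atom("B")
mol.add_bond(n, b, 0)

# A carbon atom forced to carry no inferred hydrogens.
bare_c = mol.add_atom("C", explicit_h=0)

print(mol.hydrogen_count(n), mol.hydrogen_count(b), mol.hydrogen_count(bare_c))
```

In this sketch, replacing the zero-order bond with a single bond would lower the inferred hydrogen count on both nitrogen and boron by one, which is the kind of distortion the zero-order bond is meant to avoid.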
