Representation of chemical structures

At the root of applications for substructure and similarity searching, reaction retrieval, synthesis planning, drug discovery, and physicochemical property prediction is the need for a machine-readable representation of a structure. Systematic nomenclature is unsuitable, and notations and fragment codes have been superseded, except in certain specific applications. Connection tables are widely used, but there is no formal standard. Recently the International Union of Pure and Applied Chemistry (IUPAC) International Chemical Identifier (InChI) has started to attract interest. This review also summarizes the representation of chemical reactions and three-dimensional structures.

© 2011 John Wiley & Sons, Ltd. WIREs Comput Mol Sci 2011, 1, 557–579. DOI: 10.1002/wcms.36
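To make the term "connection table" concrete, here is a minimal sketch rather than any format described in the review. It assumes a heavy-atom-only table (an atom list plus a bond list) for ethanol and a small default-valence lookup; the names `atoms`, `bonds`, `DEFAULT_VALENCE`, `neighbours`, `implicit_hydrogens`, and `molecular_formula` are all hypothetical and chosen for illustration only.

```python
from collections import Counter

# Hypothetical heavy-atom connection table for ethanol (CH3-CH2-OH).
atoms = ["C", "C", "O"]           # atom index -> element symbol
bonds = [(0, 1, 1), (1, 2, 1)]    # (atom i, atom j, bond order)

# Assumed default valences, used only to infer implicit hydrogens.
DEFAULT_VALENCE = {"C": 4, "N": 3, "O": 2}


def neighbours(atoms, bonds):
    """Build an adjacency list: atom index -> [(neighbour index, bond order)]."""
    adj = {i: [] for i in range(len(atoms))}
    for i, j, order in bonds:
        adj[i].append((j, order))
        adj[j].append((i, order))
    return adj


def implicit_hydrogens(atoms, bonds):
    """Hydrogens per atom = default valence minus the sum of bond orders."""
    adj = neighbours(atoms, bonds)
    return [
        DEFAULT_VALENCE[atoms[i]] - sum(order for _, order in adj[i])
        for i in range(len(atoms))
    ]


def molecular_formula(atoms, bonds):
    """Formula in Hill order: C first, then H, then other elements A-Z."""
    counts = Counter(atoms)
    counts["H"] += sum(implicit_hydrogens(atoms, bonds))
    ordered = [e for e in ("C", "H") if counts[e]]
    ordered += sorted(e for e in counts if e not in ("C", "H") and counts[e])
    return "".join(e + (str(counts[e]) if counts[e] > 1 else "") for e in ordered)


print(molecular_formula(atoms, bonds))
```

Because the atom numbering in such a table is arbitrary, two tables can describe the same molecule with different indices, which is one reason canonical identifiers such as InChI attract interest.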


[238]  Egon L. Willighagen,et al.  Chemical Markup, XML, and the World Wide Web. 5. Applications of Chemical Metadata in RSS Aggregators , 2004, J. Chem. Inf. Model..

[239]  Peter Willett,et al.  From chemical documentation to chemoinformatics: 50 years of chemical information science , 2008, J. Inf. Sci..

[240]  David Weininger,et al.  SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules , 1988, J. Chem. Inf. Comput. Sci..

[241]  Debra L Banville Mining chemical and biological information from the drug literature. , 2009, Current opinion in drug discovery & development.

[242]  Yang Liu,et al.  Route Designer: A Retrosynthetic Analysis Tool Utilizing Automated Retrosynthetic Rule Generation , 2009, J. Chem. Inf. Model..

[243]  Robert E. Stobaugh,et al.  The Chemical Abstracts Service Chemical Registry System. VII. Tautomerism and Alternating Bonds , 1980, J. Chem. Inf. Comput. Sci..

[244]  William Lingran Chen,et al.  Over 20 Years of Reaction Access Systems from MDL: A Novel Reaction Substructure Search Algorithm , 2002, J. Chem. Inf. Comput. Sci..

[245]  Arthur Dalby,et al.  Description of several chemical structure file formats used by computer programs developed at Molecular Design Limited , 1992, J. Chem. Inf. Comput. Sci..

[246]  Henry S Rzepa,et al.  Enhancement of the chemical semantic web through the use of InChI identifiers. , 2005, Organic & biomolecular chemistry.

[247]  Yvonne C. Martin,et al.  The Information Content of 2D and 3D Structural Descriptors Relevant to Ligand-Receptor Binding , 1997, J. Chem. Inf. Comput. Sci..

[248]  John M. Barnard,et al.  Techniques for Generating Descriptive Fingerprints in Combinatorial Libraries , 1997, J. Chem. Inf. Comput. Sci..

[249]  C. John Blankley,et al.  Comparison of 2D Fingerprint Types and Hierarchy Level Selection Methods for Structural Grouping Using Ward's Clustering , 2000, J. Chem. Inf. Comput. Sci..

[250]  J. Brecher Name=Struct: A Practical Approach to the Sorry State of Real-Life Chemical Nomenclature , 1999, J. Chem. Inf. Comput. Sci..

[251]  B. Rohde Representation and Manipulation of Stereochemistry , 2008 .

[252]  J. L. Wisniewski AUTONOM: system for computer translation of structural diagrams into IUPAC-compatible names. 1. General design , 1990, J. Chem. Inf. Comput. Sci..

[253]  F. W. Matthews,et al.  Organic Search and Display using a Connectivity Matrix Derived from Wiswesser Notation , 1967 .

[254]  Peter Willett,et al.  Use of a maximum common subgraph algorithm in the automatic identification of ostensible bond changes occurring in chemical reactions , 1981, J. Chem. Inf. Comput. Sci..

[255]  Jonathan Brecher Graphical representation standards for chemical structure diagrams (IUPAC Recommendations 2008) , 2008 .

[256]  Michael F. Lynch,et al.  The Production of Machine-Readable Descriptions of Chemical Reactions Using Wiswesser Line Notations , 1978, Journal of chemical information and computer sciences.

[257]  Wendy A. Warr,et al.  Chemical Information Management , 1992 .

[258]  Henry S. Rzepa,et al.  Chemical Markup, XML, and the World Wide Web. 4. CML Schema , 2003, J. Chem. Inf. Comput. Sci..

[259]  Brian McMahon,et al.  CIF: the computer language of crystallography. , 2002, Acta crystallographica. Section B, Structural science.

[260]  Johann Gasteiger,et al.  Chemical Information in 3D Space , 1996, J. Chem. Inf. Comput. Sci..

[261]  Ann M Richard,et al.  Chemical structure indexing of toxicity data on the internet: moving toward a flat world. , 2006, Current opinion in drug discovery & development.

[262]  G. A. Wilson,et al.  The Chemical Abstracts Service Chemical Registry System. II. Augmented Connectivity Molecular Formula , 1979, J. Chem. Inf. Comput. Sci..

[263]  Michael F. Lynch,et al.  Computer storage and retrieval of generic chemical structures in patents. 10. Assignment and logical bubble-up of ring screens for structurally explicit generics , 1989, J. Chem. Inf. Comput. Sci..

[264]  Paul Meehan,et al.  CrossFire: a structural revolution for chemists , 2001, Online Inf. Rev..

[265]  Johann Gasteiger,et al.  The central role of chemoinformatics , 2006 .

[266]  Marc Zimmermann,et al.  Über die Kunst, dem Rechner das Lesen beizubringen , 2007 .

[267]  Sigrid Rössler,et al.  The GREMAS System, an Intergral Part of the IDC System for Chemical Documentation , 1970 .

[268]  Klaus Gundertofte,et al.  A Fragment‐weighted Key‐based Similarity Measure for Use in Structural Clustering and Virtual Screening , 2006 .

[269]  Thomas E. Moock,et al.  Conformational searching in ISIS/3D databases , 1994, J. Chem. Inf. Comput. Sci..

[270]  James G. Nourse,et al.  Computer Representation and Searching of Chemical Substances , 1993 .

[271]  Frank Oellien,et al.  Enhanced CACTVS Browser of the Open NCI Database , 2002, J. Chem. Inf. Comput. Sci..

[272]  Gareth Jones,et al.  Hyperstructure model for chemical structure handling: Techniques for substructure searching , 1994, J. Chem. Inf. Comput. Sci..

[273]  H. L. Morgan The Generation of a Unique Machine Description for Chemical Structures-A Technique Developed at Chemical Abstracts Service. , 1965 .

[274]  Marian Z. DeBardeleben,et al.  Chemical supply catalog indexing: now and the future. An ideal place for use of the Wiswesser line notation , 1982, J. Chem. Inf. Comput. Sci..

[275]  W. L. Jorgensen,et al.  CAMEO: a program for the logical prediction of the products of organic reactions , 1990 .

[276]  Egon L. Willighagen,et al.  Chemical Markup, XML, and the World Wide Web, 7. CMLSpect, an XML Vocabulary for Spectral Data , 2007, J. Chem. Inf. Model..

[277]  Johann Gasteiger,et al.  Automatic Determination of Reaction Mappings and Reaction Center Information. 2. Validation on a Biochemical Reaction Database , 2008, J. Chem. Inf. Model..

[278]  Kazuhiro Saitou,et al.  Automated extraction of chemical structure information from digital raster images , 2009, Chemistry Central journal.

[279]  Ernst Meyer Eine topologische Kurzdarstellung chemischer Strukturformeln für die Dokumentation mit Elektronischen Rechenanlagen , 1965, Inf. Storage Retr..

[280]  Engelbert Zass,et al.  A user's view of chemical reaction information sources , 1990, J. Chem. Inf. Comput. Sci..

[281]  F. H. Allen,et al.  Cambridge Crystallographic Data Centre. II. Structural Data File , 1973 .

[282]  W. T. Wipke,et al.  Stereochemically unique naming algorithm , 1974 .

[283]  Peter Willett,et al.  Similarity-based virtual screening using 2D fingerprints. , 2006, Drug discovery today.

[284]  W. J. Wiswesser,et al.  Conversion of Wiswesser Notation to a Connectivity Matrix for Organic Compounds , 1967 .