50 Years of the Cambridge Structural Database of Organic and Organometallic Compounds

J. Hašek

Institute of Biotechnology AS ČR, Vídeňská 1083, 142 20 Praha 4
hasekjh at seznam.cz

The Cambridge Structural Database (CSD) of organic and organometallic compounds celebrates 50 years since its foundation. The Czech and Slovak scientific community, represented by the “Regional Affiliated Center”, has covered the annual fee for access to the CSD for non-commercial organizations continuously since 1973. The database is necessary for a deep understanding of structural chemistry, solid-state chemistry, materials research, and the design of advanced materials, for any research where understanding structure-function relations is important, and for any reliable computational simulation of the structure and dynamics of real molecular systems.

The license for 2015 includes more than ¾ million experimentally determined structures of organic and organometallic compounds. The database is complemented by the Structure Database of Crystalline Polymer Compounds (POLYBASE, containing more than 400 polymers) and the Database of Protein-Polymer Interactions (DPPI, containing more than 2000 experimentally determined interactions, mostly PEG-protein).

In the CSD, structure searches, calculation of structure parameters, and filtering and tabulation of the relevant data are done with the program QUEST CSD. Detailed analysis of structural relationships, calculation of powder diffraction patterns, and visualization of structures in the solid phase are routinely done with the program MERCURY. An overview of experimentally verified interactions between molecules in the solid phase is provided by the software ISOSTAR; interactions between macromolecules are analyzed with SUPERSTAR. Information on the CSD system is available at www.ccdc.cam.ac.uk. Any installation on computers in the Czech and Slovak Republics should be done by registering with the club of CSD users within the Crystallographic Association (http://www.xray.cz/xray/csca/data/r_form_cz.htm). The instructions for installing the CSD software can be downloaded during registration. For help with any problems, contact hasekjh at seznam.cz.

The polymer structure database is searched with POLYBASE, and the structures are best analyzed with MERCURY. The database of protein-polymer interactions can be inspected with a plain text editor and viewed with the program PYMOL (the incentive version, which fully meets these needs, is free of charge at https://www.pymol.org).

A great deal of inspiration on how to use the CSD can be found on the CCDC pages. Several thousand papers reporting scientific studies based on CSD data and produced with the CSD software, sorted by date of publication, are listed at http://www.ccdc.cam.ac.uk/ResearchAndConsultancy/CCDCResearch/CCDCPublications/Pages/CCDCPublications.aspx

Alternatively, you can search for papers by journal or other criteria at https://services.ccdc.cam.ac.uk/webcite/search/.

For teaching, introductory courses on the CSD for secondary schools with chemistry curricula and for universities can be found, for example, at the School of Chemistry of Newcastle University at http://www.ncl.ac.uk/chemistry/outreach/resources/ccdc/

Alongside the 50th anniversary of the CCDC, the related Protein Data Bank (PDB) this year celebrates 45 years since its foundation. It contains more than 100 thousand experimentally determined structures of biological macromolecules, most of them determined by protein crystallography. A great advantage of protein crystallography is that it places no limit on the complexity of the molecular system. Nowadays it is relatively easy to determine the structures of complexes of several tens of proteins and to observe intermolecular interactions in their full complexity. More than 90 % of the structures were determined by X-ray diffraction, about 9 % by NMR, and less than 1 % by other methods.

This work is supported by the project BIOCEV CZ.1.05/1.1.00/02.0109 from the ERDF and by LG14009.