STRUCTURE DATABASE OF ORGANIC POLYMERS AND THEIR INTERACTIONS WITH BIOMACROMOLECULES

J. Hašek1, M. Steinhart2, T. Koval1, P. Kolenko1,3, T. Skálová1, J. Brus2, J. Dohnálek1

1Institute of Biotechnology, Academy of Sciences, Průmyslová 595, Vestec

2Institute of Macromolecular Chemistry, Academy of Sciences, Heyrovského nám. 2, Praha 6

3Faculty of Nuclear Sciences and Physical Engineering CTU, Břehová 7, Praha 1

hasekjh@seznam.cz

The Cambridge Structural Database of organic and organometallic compounds (CSD) [1], in its 2022 version, can be searched by the keyword “polymer”. The search returns a large number of organometallic polymers, i.e. crystalline structures in which organic molecules are interconnected by metal bridges to form “infinite” 1D, 2D, or 3D networks extending through the whole crystal. These crystals are typically well ordered, and the resulting clear diffraction pattern allows the reliable and precise structure determination required for deposition in the CSD.

However, for classical organic polymers, preparing high-quality crystalline samples is extremely difficult, mainly because of polydispersity and the extremely long times required to reach the equilibrium state. Diffraction quality is therefore often very low, and the experimental structures are often inaccurate and require theoretical re-modelling. Roughly half of the structure determinations do not satisfy the requirements for deposition in the CSD. Some synthetic or natural polymers can also be found in the Crystallography Open Database [2]. However, about half of the published structures are not present in these databases. This is why the Polymer Structure Database (POLYBASE-2011), which collects all available organic polymer structures [3], was prepared. The new POLYBASE-2022 version is now being completely re-curated.

Hydrophilic polymers are often used as precipitants for the crystallization of bio-macromolecules. This is why the Database of Protein-Polymer Interactions (DPPI-2011) [3] was created. The current DPPI-2022 version contains 3667 PDB structures of bio-macromolecules (proteins and nucleic acids). These structures, collected from the RCSB server [4], experimentally confirm complexation of poly(ethylene glycol) chains (at least four monomers in length) at the protein surface.
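The chain-length criterion mentioned above (at least four monomers) can be illustrated with a minimal sketch. It is not the procedure used to build the DPPI; it assumes, purely for illustration, that the heavy-atom backbone of a linear ligand is available as a string of element symbols (e.g. "OCCOCCOCCOCCO" for tetraethylene glycol). The function names and this string representation are hypothetical.

```python
import re

# One ethylene glycol repeat unit along the heavy-atom backbone: -CH2-CH2-O-
# (hydrogens are omitted, so the unit appears as the symbols "CCO").
_REPEAT_RUN = re.compile(r"(?:CCO)+")

# Threshold quoted in the text: chains of at least four monomers.
MIN_MONOMERS = 4


def count_peg_monomers(backbone: str) -> int:
    """Return the length (in monomers) of the longest uninterrupted run
    of C-C-O units in a linear backbone given as element symbols.

    Assumption: `backbone` is a hypothetical, simplified representation
    containing only the symbols C and O in chain order.
    """
    runs = [m.group(0) for m in _REPEAT_RUN.finditer(backbone)]
    if not runs:
        return 0
    return max(len(run) for run in runs) // 3


def has_min_chain_length(backbone: str, minimum: int = MIN_MONOMERS) -> bool:
    """True if the backbone holds at least `minimum` consecutive monomers."""
    return count_peg_monomers(backbone) >= minimum


if __name__ == "__main__":
    for name, chain in [
        ("triethylene glycol", "OCCOCCOCCO"),
        ("tetraethylene glycol", "OCCOCCOCCOCCO"),
    ]:
        print(name, count_peg_monomers(chain), has_min_chain_length(chain))
```

A real selection would have to work on the atomic coordinates and connectivity stored in the PDB entries rather than on such a string; the sketch only makes the counting rule explicit.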

Because many of these proteins are complexed with more than one polymer chain, the DPPI-2022 contains several thousand experimentally verified interactions of hydrophilic organic polymers bound directly to the surface of protein molecules. Visual inspection of the DPPI-2022 reveals a surprisingly large number of different types of protein-polymer interactions. Classification of these interactions provides a useful background for explaining the success of poly(ethylene glycol)-type polymers in many economically important applications in industry, science, medicine and pharmaceutics.

The Polymer Structure Database (POLYBASE) and the Database of Protein-Polymer Interactions (DPPI) are currently being updated, and their new versions will be available on request by the end of 2022.

The research was supported by the project CZ.02.1.01/0.0/0.0/15_003/0000447 from the ERDF.

1.      Groom, C. R., Bruno, I. J., Lightfoot, M. P. & Ward, S. C. Acta Cryst. (2016) B72, 171-179. DOI: 10.1107/S2052520616003954

2.      Quirós, M., Gražulis, S., Girdzijauskaitė, S., Merkys, A. & Vaitkus, A. Journal of Cheminformatics, (2018) 10 (23), 1-17. DOI: 10.1186/s13321-018-0279-6

3.      Hašek, J., Z. Kristallogr. (2011) 28, 475-480. DOI: 10.1524/9783486991321-077

4.      wwPDB consortium Protein Data Bank: Nucleic Acids Research, (2018) 47, D520-D528. DOI: 10.1093/nar/gky949