A method for accelerated free energy calculations of proteins in an extended experimental ensemble derived from the Protein Data Bank

Emanuel K. Peter, Jiří Černý

Institute of Biotechnology, CAS, BIOCEV, Průmyslová 595, 252 50 Vestec, Prague West, Czech Republic

 

 

In this talk, we present a method for simulating protein systems within an experimental ensemble derived from the Protein Data Bank (PDB). We collected data from a non-redundant set of more than 24,000 high-resolution protein X-ray structures and analyzed the radial distribution functions (RDFs) g(r). We then used the corresponding potentials of mean force (PMFs) w(r) to accelerate MD simulations of proteins while simultaneously correcting the underlying force field. A structural analysis of the PMF data identified collective properties of groups of aliphatic, hydrophilic, and aromatic amino acids. We validated the method in simulations of dipeptides based on dialanine and compared the results with path-sampling simulations. We found a dependence on the position of the varied residue and on the chemical configuration of the side chain of the amino acid adjacent to Ala. A comparison of PMF-based simulations using different AMBER force fields yields approximately identical free energy partitions, independent of the choice of force field. We then simulated mutated penta-alanines with five different point mutations and observed that an N-terminal mutation had the largest effect on the free energy landscape of the peptide. Applying the methodology to folding simulations of Trp-cage, we observed that PMF-based sampling significantly improves the description of structural conformers along the folding pathway of this peptide. Finally, we give a perspective on the use of g(r) data from the PDB for the accelerated simulation of DNA conformers and protein-DNA complexes.
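
For readers unfamiliar with the link between g(r) and w(r), the standard textbook relation is w(r) = -k_B T ln g(r). The short Python sketch below illustrates only that conversion. The function name, the temperature of 300 K, and the kJ/mol units are assumptions made for illustration; the abstract does not specify how the PDB-derived PMF is binned, smoothed, or coupled to the MD force field, so none of that is shown.

```python
import numpy as np

# Molar gas constant (Boltzmann constant times Avogadro's number), kJ/(mol K).
R_KJ_PER_MOL_K = 0.0083144626


def pmf_from_rdf(g, temperature=300.0):
    """Convert a radial distribution function g(r) into a PMF w(r).

    Uses the standard relation w(r) = -k_B T ln g(r), in kJ/mol.
    Bins where g(r) is zero (never observed) are assigned +inf.
    The default temperature is an illustrative assumption.
    """
    g = np.asarray(g, dtype=float)
    w = np.full(g.shape, np.inf)
    observed = g > 0
    w[observed] = -R_KJ_PER_MOL_K * temperature * np.log(g[observed])
    return w


# Hypothetical g(r) values: unsampled, depleted, bulk-like, enriched.
example_g = [0.0, 0.5, 1.0, 2.0]
example_w = pmf_from_rdf(example_g)
```

In this sketch, distances at which a contact is enriched relative to the reference (g > 1) receive a negative, favorable w(r), distances at which it is depleted (g < 1) receive a positive w(r), and g = 1 maps to zero.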