Analysis and acceleration of molecular simulations by time-lagged tSNE

H. Hradiská1, P. Kříž2, M. Kurečka3, J. Beránek1, G. Tedeschi1, V. Višňovský3,
A. Křenek3 and V. Spiwok1

1Department of Biochemistry and Microbiology, University of Chemistry and Technology, Prague

2Faculty of Mathematics and Physics, Charles University

3Institute of Computer Science, Masaryk University

spiwokv@vscht.cz

 

tSNE (t-distributed Stochastic Neighbour Embedding) is a popular method for analysing data from single-cell gene expression measurements, RNAseq, flow cytometry and other experiments that produce high-dimensional data. It can also be used to analyse structures sampled by molecular dynamics simulations. We developed a variant of tSNE called time-lagged tSNE. Structures sampled by molecular dynamics simulations are first superimposed onto a reference structure to remove translational and rotational motions. Next, they are analysed by a variant of independent component analysis that correlates the coordinates of the molecular system with the same coordinates taken after a time lag. This emphasizes slow motions and suppresses fast motions. Finally, tSNE is applied to the output.
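The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`superimpose`, `time_lagged_projection`, `embed_tsne`), the whitening threshold, and the choice to weight each component by its time-lagged eigenvalue are all hypothetical, and the final step assumes scikit-learn is installed.

```python
import numpy as np

def superimpose(frames, reference):
    """Fit every frame (n_atoms, 3) onto the reference (Kabsch algorithm)."""
    frames = np.asarray(frames, dtype=float)
    ref = reference - reference.mean(axis=0)          # remove translation
    aligned = np.empty_like(frames)
    for i, xyz in enumerate(frames):
        x = xyz - xyz.mean(axis=0)
        u, _, vt = np.linalg.svd(x.T @ ref)
        d = np.sign(np.linalg.det(u @ vt))            # avoid mirror images
        aligned[i] = x @ (u @ np.diag([1.0, 1.0, d]) @ vt)
    return aligned

def time_lagged_projection(X, lag, n_components):
    """Project features X (n_frames, n_features) onto the slowest motions."""
    X = X - X.mean(axis=0)
    # Whiten the data; drop near-zero modes left over after superposition.
    evals, evecs = np.linalg.eigh(X.T @ X / len(X))
    keep = evals > 1e-10 * evals.max()                # assumed threshold
    Y = X @ (evecs[:, keep] / np.sqrt(evals[keep]))
    # Symmetrized covariance between coordinates and time-lagged coordinates.
    C = Y[:-lag].T @ Y[lag:] / (len(Y) - lag)
    C = 0.5 * (C + C.T)
    tvals, tvecs = np.linalg.eigh(C)
    order = np.argsort(tvals)[::-1][:n_components]
    # Assumption: scale each component by its eigenvalue so that slow
    # motions dominate the distances seen by tSNE.
    return Y @ tvecs[:, order] * tvals[order], tvals[order]

def embed_tsne(Z, perplexity=30.0, seed=0):
    """Standard tSNE on the time-lagged projection (needs scikit-learn)."""
    from sklearn.manifold import TSNE
    return TSNE(n_components=2, perplexity=perplexity,
                random_state=seed).fit_transform(Z)
```

A typical call would flatten the aligned frames first, for example `X = aligned.reshape(len(aligned), -1)`, then pass `X` to `time_lagged_projection` and its first return value to `embed_tsne`. The lag time and the number of retained components are free parameters here.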

The result is a 2D map of the conformations of a molecular system. For simulations of Trp-cage mini-protein folding and unfolding, we obtained a plot with a central cluster corresponding to the unfolded structure. The folded structure and other long-lived structures appeared as peripheral clusters surrounding the unfolded state. Unlike standard tSNE, this representation captures not only structural differences between states, but also kinetics.

We see great potential for time-lagged tSNE in the acceleration of molecular simulations. We used a method called metadynamics to drive conformational changes along the 2D map from time-lagged tSNE. For this purpose, time-lagged tSNE had to be modified so that its coordinates can be calculated on the fly and so that forces acting on the time-lagged tSNE coordinates can be converted into forces acting on individual atoms. We solved this problem by using an artificial neural network in parametric time-lagged tSNE.
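The conversion of forces mentioned above follows from the chain rule: if a differentiable network maps atomic coordinates x to 2D coordinates s(x), a force F_s acting on s becomes the force J F_s acting on x, where J contains the derivatives of s with respect to x. The sketch below illustrates this with a small one-hidden-layer network in NumPy. The architecture, the random weights and all function names are hypothetical; in practice the network would first be trained to reproduce the time-lagged tSNE embedding.

```python
import numpy as np

def init_mlp(n_in, n_hidden, n_out=2, seed=0):
    """Random (untrained) weights; stands in for a trained parametric map."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.5, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.5, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def collective_variables(params, x):
    """Map flattened, superimposed coordinates x to 2D map coordinates."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"]

def jacobian(params, x):
    """J[i, k] = d s_k / d x_i, obtained analytically for the tanh layer."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    return (params["W1"] * (1.0 - h ** 2)) @ params["W2"]

def atomic_forces(params, x, cv_forces):
    """Chain rule: forces on the map coordinates become forces on atoms."""
    return jacobian(params, x) @ cv_forces
```

For a bias potential V(s), the force on the map coordinates is the negative gradient of V with respect to s, and `atomic_forces` returns the corresponding negative gradient with respect to x. The same pattern applies to deeper networks, where automatic differentiation would normally replace the hand-written Jacobian.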

We successfully applied this method to the folding of the Trp-cage mini-protein.

 

1. Spiwok V., Kříž P. Front. Mol. Biosci. 7, (2020), 132.

2. Hradiská H., Kurečka M., Beránek J., Tedeschi G., Višňovský V., Křenek A., Spiwok V. J. Phys. Chem. B 128(4), (2024), 903-913.

This work was supported by the Czech Science Foundation (22-29667S).