DNA Conformational Classes

 

Daniel Svozil


Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic, 166 10 Praha

 

A total of 7753 dinucleotides from 447 high-resolution DNA structures were analyzed in the torsional space of 14 conformational variables. The conformational types were determined by a stepwise classification procedure. In the first step, 3D maps of torsion angles were chosen (nine different combinations of torsion angles were used), and aggregates of data points (peaks) were identified in each map from the point density by means of Fourier averaging. In each of the nine maps analyzed, ~20 peaks were identified. Each peak was approximated by a sphere, and individual data points were assigned to peaks according to their distances from the peak centers. Every data point was thus labeled with the names of its neighbouring peaks in all nine maps. In the second step, the labeled data points were clustered by a technique called lexicographical clustering. Lexicographical clustering creates a characteristic imprint for each data point; a group of data points with identical (or nearly identical) imprints then defines a cluster. Because each imprint represents a conformation near peak positions, each cluster represents a dinucleotide conformational family.
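
The labeling and clustering steps described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work. It assumes that the peak centers have already been located (the Fourier averaging step is not reproduced), that a data point is assigned to the nearest peak center lying within a fixed sphere radius, and that only strictly identical imprints are grouped (nearly identical imprints are not merged). The function names, the torsion names t1 to t4, the 30 degree radius, the "-" label for unassigned points, and all numerical values are hypothetical and serve only as an illustration; two toy maps stand in for the nine maps.

```python
from collections import defaultdict
import math


def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees (periodic)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def torsion_distance(p, q):
    """Euclidean distance between two points in a map of torsion angles."""
    return math.sqrt(sum(angle_diff(a, b) ** 2 for a, b in zip(p, q)))


def assign_peak(point, peaks, radius):
    """Return the label of the nearest peak center within `radius`, else '-'.

    `peaks` maps a peak label to its center. The radius is an assumed,
    uniform sphere size; the abstract does not state how spheres were sized.
    """
    best_label, best_dist = "-", radius
    for label, center in peaks.items():
        d = torsion_distance(point, center)
        if d <= best_dist:
            best_label, best_dist = label, d
    return best_label


def imprint(torsions, maps, radius=30.0):
    """Concatenate the peak labels of one dinucleotide over all maps."""
    labels = []
    for axes, peaks in maps:
        point = tuple(torsions[name] for name in axes)
        labels.append(assign_peak(point, peaks, radius))
    return "".join(labels)


def cluster_by_imprint(dinucleotides, maps, radius=30.0):
    """Group dinucleotides whose imprints are strictly identical."""
    clusters = defaultdict(list)
    for name, torsions in dinucleotides.items():
        clusters[imprint(torsions, maps, radius)].append(name)
    return dict(clusters)


# Toy data: two hypothetical 3D maps, each with two invented peak centers.
maps = [
    (("t1", "t2", "t3"), {"a": (60, 180, 300), "b": (200, 60, 60)}),
    (("t2", "t3", "t4"), {"a": (180, 300, 0), "b": (60, 60, 180)}),
]
dinucleotides = {
    "d1": {"t1": 60, "t2": 180, "t3": 300, "t4": 10},
    "d2": {"t1": 65, "t2": 175, "t3": 295, "t4": 355},
    "d3": {"t1": 200, "t2": 60, "t3": 60, "t4": 180},
    "d4": {"t1": 0, "t2": 0, "t3": 0, "t4": 90},
}
clusters = cluster_by_imprint(dinucleotides, maps)
print(clusters)
```

In this toy example d1 and d2 share the imprint "aa" and therefore fall into one cluster, d3 has the imprint "bb", and d4 lies outside every sphere and receives the imprint "--". The periodic angle difference matters here: d2 has t4 = 355, which is only 5 degrees away from a peak center at 0.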