Bohdan Schneider1, Zdeněk Morávek2, and Helen M. Berman3
1Center for Complex Molecular Systems and Biomolecules and Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic, Flemingovo n. 2, CZ-16602 Prague, Czech Republic, bohdan@rcsb.rutgers.edu
2Faculty of Mathematics and Physics, Charles University, Ke Karlovu 2, Prague, Czech Republic
3Rutgers, The State University of New Jersey, Department of Chemistry and Chemical Biology, NJ-08854, USA
The diversity and complexity of RNA
structure are a consequence of the high flexibility of the polynucleotide
backbone and are best exemplified by the crystal structures of the ribosomal
subunits. A nucleotide has seven torsional degrees of freedom, including the
torsion χ around the glycosidic bond; this
multidimensionality of the nucleotide conformational space is a major
obstacle to its systematic analysis. In the work presented here, the
multidimensional RNA conformational space and the very large number of possible correlations
among the individual torsion angles were simplified by focusing on the
interrelationships between the torsion angles that define the phosphodiester
linkage (labeled ζi and αi+1) and the other backbone torsion
angles. A single near-atomic-resolution structure with over 2800 nucleotides
from the 23S and 5S rRNA molecules of the large ribosomal subunit (NDB code
RR0033, PDB code 1JJ2, ref. 1) serves as the database for the analysis.
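For readers who want to reproduce the raw data, the following is a minimal sketch of how a single backbone torsion can be computed from four consecutive atom positions. It uses the standard dihedral formula; the function name `torsion_deg` and the 0–360° output range are choices made here for illustration and are not taken from the original work.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion_deg(p1, p2, p3, p4):
    """Torsion angle p1-p2-p3-p4 in degrees, mapped to [0, 360).

    Hypothetical helper: for the torsion called zeta in the text the four
    atoms would be C3'(i), O3'(i), P(i+1), O5'(i+1).
    """
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)          # normal of the plane p1, p2, p3
    n2 = _cross(b2, b3)          # normal of the plane p2, p3, p4
    b2_len = math.sqrt(_dot(b2, b2))
    x = _dot(n1, n2)
    y = _dot(_cross(n1, n2), b2) / b2_len
    return math.degrees(math.atan2(y, x)) % 360.0
```

Mapping the result to 0–360° matches the way angles such as ~290° and ~300° are quoted in this abstract.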
Detailed analysis of the RNA
backbone torsions was performed in six three-dimensional projections of the
multidimensional torsional space. Each 3D torsional map consists of a
distribution of points (τ1, τ2, τ3)i that were Fourier-transformed (FT) into
pseudo electron densities; the density maps were visually inspected to localize peaks,
and the peak maxima were fitted. Before the Fourier averaging, the original data matrix of
2841 points had to be modified. A majority (~70%) of all nucleotides of RR0033
are in the A-type conformation, with torsion angle values at the phosphodiester link of ζi
~290° and αi+1 ~300°. These residues were
excluded from the original data matrix because the concentration of most points
in a narrow area of the map would distort the pseudo electron densities in
other regions. The remaining 830 points were Fourier-averaged and further
analyzed.
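The two steps described above (removing A-type points, then turning the remaining points into a density) can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: each point is treated as a point "atom" in a periodic 360° × 360° × 360° cell, structure factors are summed up to a small index limit, and the density is synthesized by the inverse Fourier sum. The tolerance `tol`, the index limit `hmax`, and the Gaussian width `sigma_deg` are hypothetical parameters that do not come from the abstract.

```python
import cmath
import math
from itertools import product

def angular_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def is_a_type(zeta, alpha, tol=40.0):
    """True if (zeta, alpha) lies near the A-type values (~290, ~300).

    The window half-width tol is an assumption; the text gives no cutoff.
    """
    return angular_diff(zeta, 290.0) <= tol and angular_diff(alpha, 300.0) <= tol

def structure_factors(points_deg, hmax=4, sigma_deg=15.0):
    """Structure factors F(h,k,l) of point 'atoms' in a periodic torsion cell."""
    fracs = [tuple((t % 360.0) / 360.0 for t in p) for p in points_deg]
    s = sigma_deg / 360.0
    factors = {}
    for hkl in product(range(-hmax, hmax + 1), repeat=3):
        # Gaussian damping acts like a temperature factor and smooths the map.
        damp = math.exp(-2.0 * math.pi ** 2 * s ** 2 * sum(h * h for h in hkl))
        total = sum(
            cmath.exp(2j * math.pi * sum(h * x for h, x in zip(hkl, f)))
            for f in fracs
        )
        factors[hkl] = damp * total
    return factors

def density_at(factors, point_deg):
    """Pseudo electron density at one position by inverse Fourier summation."""
    frac = [(t % 360.0) / 360.0 for t in point_deg]
    rho = sum(
        F * cmath.exp(-2j * math.pi * sum(h * x for h, x in zip(hkl, frac)))
        for hkl, F in factors.items()
    )
    return rho.real
```

Evaluating `density_at` on a regular grid would give a map in which clusters of points appear as peaks; the exclusion of A-type points keeps one dominant peak from swamping the rest.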
In each of the six analyzed maps,
about ten peak maxima were identified, and their positions were fitted and named.
Distances between the peaks and the 830 individual data points (τ1, τ2, τ3)i
of each map allow each data point to be labeled with the name of the nearest peak. Data
points labeled by peak names in the six maps were clustered by a technique
called “lexicographical clustering”, which starts by sorting the
data point labels for the six maps alphabetically, in the same way as one would order words in
a dictionary. To make sure that lexicographical clusters represent conformational families, the clustered dinucleotide
fragments were compared by standard least-squares superposition of dinucleotide
atoms, and outliers were removed;
rmsd values of the families
were 0.2–0.7 Å. Most dinucleotides in the families are conformationally
so similar that all their torsions and their
averaged Cartesian coordinates could be determined. The identified conformations will be
characterized in the talk and the accompanying poster. Here we summarize a few
of the most interesting findings.
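The labeling and lexicographical clustering described above can be sketched as follows. This is a minimal illustration under assumptions: distances are taken as Euclidean in the three torsions with 360° periodicity, and points are grouped only when their six-label "words" are identical. The abstract does not say how distances were measured or whether near-identical words were merged, and the function names are hypothetical.

```python
from collections import defaultdict

def angular_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def nearest_peak(point, peaks):
    """Name of the peak closest to a (t1, t2, t3) point.

    peaks maps a peak name to its fitted (t1, t2, t3) position.
    """
    def dist2(name):
        return sum(angular_diff(x, y) ** 2 for x, y in zip(point, peaks[name]))
    return min(peaks, key=dist2)

def lexicographic_clusters(labels_per_point):
    """Group points whose label 'words' are identical, in dictionary order.

    labels_per_point maps a point identifier to a tuple of peak names,
    one name per torsional map.
    """
    groups = defaultdict(list)
    for point_id, labels in labels_per_point.items():
        groups[tuple(labels)].append(point_id)
    return sorted(groups.items())
```

Each resulting group is only a candidate family; as the text notes, candidates were still checked by least-squares superposition and outliers were removed.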
Non-A-type conformations are in
most cases isolated between nucleotides in A-type conformations and rarely
connect to one another. In particular,
several “open” conformations (numbers 8–17) occur in single-stranded regions
linking two or more double helices. Some other conformations with stacked bases
(such as #1 and #3–6) can be part of double-helical regions where the helix is
locally disrupted by a bulge or non-canonical base pair(s).
Sequence
preferences were observed in only a few conformational families. Notably,
conformations #4–6 prefer purine-rich regions (preference for RR), and #2 occurs preferentially in tetraloops with the sequence RNRN. In contrast, the
conformation with parallel orientation of the subsequent bases and zero rise,
known as the “adenine platform” motif (#7), showed no sequence preference for AA.
Stacked or parallel bases, ‘normal’ rise.
Conformation #1: the backbone conformation is, in fact,
very close to that of the purine-pyrimidine (RY) steps of Z-DNA, but in contrast
to Z-DNA, both bases have the ‘normal’ anti
orientation and the conformation shows none of the sequence preference for RY known from
Z-DNA. Conformation #2: an unusual combination of torsions ζi–αi+1–βi+1–γi+1 reverses the direction of the
backbone at the beginning of the second nucleotide, so that the second ribose is
flipped upside down from its A-type position, the second base is rotated
anticlockwise from its A-type position by ~180°, and the bases do not stack.
The conformation has a preference for short loops, mostly tetraloops, with
the prevailing sequence RNRN. It is most often located at the stem-loop interface,
and one nucleotide of the motif forms a non-canonical pair, typically G•A, of the
tetraloop.
Parallel bases and low-to-zero rise.
Conformations #5–7: the bases have low rise, are in
edge-to-edge orientation, and can form non-canonical hydrogen bonds directly or via a water molecule. A significant
feature of #5–6 is that their dinucleotides can occur on opposite strands
of double helices with non-canonical base pairs and prefer purine-rich regions;
the motif itself mostly has an RR sequence. They can also occur in single-stranded
links. Family #7 is very similar to the motif known as the “adenine
platform”, but it shows no sequence preference for AA.
“Open conformations”: unstacked bases, short-to-normal Pi–Pi+2 and large C1’i–C1’i+1 distances.
Conformations #8–9: the backbone forms a U-shaped turn
in the RNA direction with short Pi–Pi+2 distances. The
second base is rotated 180° away from its position in the A-type conformation but lies in
the same plane; conformation #9 has the first base in the minor syn orientation. Dinucleotides of families
8–9 participate mostly in single-stranded links or bulges between double-helical
regions; they form base pairs only rarely and are never involved in
canonical ones. Conformations #14–17 are extremely extended: the bases are
rotated away from each other, the first base is ‘above’ Pi and
the second ‘below’ Pi+2, and they border the dinucleotide on both
ends. The positions of the base and the phosphate attached to the second ribose are
swapped, and the backbone of the dinucleotide has an S shape. Like
other “open” conformations, #14–17 form hinges between a short single-stranded
link and a double helix.
Conformations #19–32: nucleotides in A-type
conformations form about 70% of the studied ribosomal RNA and were investigated further.
In the large majority of them, 1513 in total, the whole dinucleotide is in the conformation
of canonical A-RNA. There are, however, about fifteen other well-defined
conformational families with small but pronounced deviations from canonical
A-RNA. These deviations are localized in one or two torsion angles of the first
or second nucleotide.
The present work suggests that the multidimensionality of the RNA
conformational space can be approached by analysis of conformations at the
phosphodiester link O3’i–Pi+1–O5’i+1, defined
by the torsion angles ζi–αi+1. We deduced the central role of
the torsions ζi–αi+1 from the fact that they exhibit
the highest variability yet are confined to well-defined regions, noise
notwithstanding. We suggest that the character and importance of the ζi–αi+1 scattergram can be compared to
those of the cornerstone of protein structural science, the Ramachandran plot of the protein
backbone torsion angles φ and ψ.
1. Ban, N., Nissen, P., Hansen, J., Moore, P.B. and Steitz, T.A. (2000) The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science, 289, 905-920.