Secondary structure of prion mRNA
Luck R; Steger G; Riesner D
Institut für Physikalische Biologie, Heinrich-Heine-Universität, Düsseldorf, FRG.
J Mol Biol 258: 813-26 (1996)
An algorithm for prediction of conserved secondary structure of single-stranded RNA is presented. For each RNA of a set of homologous RNAs, optimal and suboptimal secondary structures are calculated and stored in a base-pair probability matrix. A multiple sequence alignment is performed for the set of RNAs. The resulting gaps are introduced into the individual probability matrices. These homologous probability matrices are summed to give a consensus probability matrix emphasizing the conserved secondary structure elements of the RNA set. Thus the algorithm combines the advantages of thermodynamic structure prediction by energy minimization with the information obtained from phylogenetic alignment of sequences.
The algorithm is applied to three examples. The REV-responsive element of HIV, the structure of which is well known from the literature, was chosen to test the algorithm. The second example is the 3' terminal segment of genomic single-stranded RNAs of cucumber mosaic viruses; a structure similar to that of the related brome mosaic virus was expected and was confirmed. The third example is the prion-protein mRNA from different organisms; the structure of this mRNA is not known. By application of the algorithm highly conserved hairpins were found in the prion-protein mRNA.
Introduction
Abbreviations used: bp, base-pair; nt, nucleotides; ssRNA, single-stranded RNA; ORF, open reading frame; UTR, untranslated region
PrP is the major component of prions (for review see Prusiner, 1994), which cause several neurodegenerative diseases in humans and animals, including scrapie in sheep, bovine spongiform encephalopathy (BSE) in cattle, and Creutzfeldt-Jakob disease (CJD) in man. During prion infection, an abnormal isoform of PrP, in the case of scrapie designated PrPSc, is produced by an unknown process from the cellular isoform PrPC, which is encoded by a chromosomal gene. In spite of intensive studies, differences between the chemical compositions of PrPC and PrPSc could not be found (Stahl et al., 1993). Thus, a mere conformational change as the origin of the transformation of PrPC into PrPSc is an attractive model (for review see Prusiner, 1994). It might be that putative structural elements in the PrP mRNA influence the kinetics of sequential folding of the protein during translation, and that the process of infection acts via those structural elements. A potential stem-loop structure in one PrP mRNA, i.e. that of man, has been discussed (Wills & Hughes, 1990).
If, however, structural elements of the PrP mRNA are considered as functionally relevant for the development of all prion diseases, this feature must be found in every species that is susceptible to this class of disease. Therefore the new algorithm will be applied to check PrP mRNA for evolutionarily conserved secondary structures.
Prediction of structural elements in mRNA of prion protein
From the EMBL data bank, 23 PrP mRNA sequences of the following species were available: three rodents (mouse, hamster, rat), two ruminants (cattle, sheep), the human and 17 non-human primates consisting of four apes (gorilla, chimpanzee, gibbon, orangutan), seven old-world monkeys (colobus, presbytis, baboon, mandrill, rhesus macaque, Macaca arctoides, African green monkey), and six new-world monkeys (spider, squirrel, capuchin, aotes, marmoset, titi). Calculations of individual secondary structures of these molecules with the programs RNAfold and LinAll revealed that thermodynamically stable elements of secondary structure are located mainly in the 5' region including the ORF. In contrast, the 3' UTR with a lower G + C content (41% G + C versus 53% G + C in the ORF of the hamster sequence) is mainly single stranded. As examples, the calculated secondary structures of human and hamster PrP mRNAs are shown in Figure 5.
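The G + C comparison between the ORF and the 3' UTR is a simple per-region base count. A minimal Python sketch of such a count follows, assuming the regions are available as plain sequence strings; the function name `gc_content` and the toy sequence are illustrative only and are not part of RNAfold or LinAll.

```python
def gc_content(seq: str) -> float:
    """Return the G+C fraction of a nucleotide sequence (DNA or RNA letters).

    Characters other than A, C, G, T, U (e.g. gaps or ambiguity codes)
    are ignored; this is an assumption of the sketch, not of the paper.
    """
    counted = [base for base in seq.upper() if base in "ACGTU"]
    if not counted:
        raise ValueError("sequence contains no A/C/G/T/U characters")
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)


if __name__ == "__main__":
    # Hypothetical toy region, not a real PrP sequence.
    toy_region = "AUGGCGAACC"
    print(f"G+C content: {gc_content(toy_region):.0%}")
```

Applied separately to the ORF and to the 3' UTR of one sequence, a function of this kind would give the two percentages being compared.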
Alignment of the 23 sequences with CLUSTAL V (Higgins & Sharp, 1988; Higgins, 1994) showed considerable homology in an 1100 nt fragment (individual sequences starting between nt -42 and -10 and ending at nt 788 to 1039; numbering relative to the start of the ORF). These fragments that contain the ORF were chosen for further studies. They have a length of 735 to 795 nt and showed a high degree of homology (74 to 97%). The secondary structure distributions of the individual fragments were calculated with RNAfold at 50°C. With the alignment mentioned above, the consensus base-pair probability matrix was calculated and is presented as a dot plot in Figure 6.
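The consensus step described in the abstract (introduce the alignment gaps into each base-pair probability matrix, then sum the matrices) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each matrix is assumed to be a sparse dictionary mapping 0-based pairs (i, j) of ungapped positions to a probability, the gap character is assumed to be "-", and the names `aligned_columns`, `expand_matrix` and `consensus_matrix` are hypothetical. The optional `average` flag, which divides by the number of sequences, is an addition of this sketch; the text only states that the matrices are summed.

```python
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[int, int]  # 0-based (i, j) positions with i < j


def aligned_columns(aligned_seq: str, gap: str = "-") -> List[int]:
    """Map each ungapped position of a sequence to its alignment column."""
    return [col for col, ch in enumerate(aligned_seq) if ch != gap]


def expand_matrix(probs: Dict[Pair, float], aligned_seq: str) -> Dict[Pair, float]:
    """Re-index a per-sequence probability matrix to alignment columns.

    Gap columns receive no entries, which is equivalent to inserting
    empty rows and columns at the gap positions.
    """
    cols = aligned_columns(aligned_seq)
    return {(cols[i], cols[j]): p for (i, j), p in probs.items()}


def consensus_matrix(
    aligned_seqs: Sequence[str],
    matrices: Sequence[Dict[Pair, float]],
    average: bool = False,
) -> Dict[Pair, float]:
    """Sum gap-expanded matrices over all sequences of an alignment."""
    if len(aligned_seqs) != len(matrices):
        raise ValueError("need exactly one matrix per aligned sequence")
    if len({len(s) for s in aligned_seqs}) > 1:
        raise ValueError("aligned sequences must have equal length")
    total: Dict[Pair, float] = {}
    for seq, probs in zip(aligned_seqs, matrices):
        for pair, p in expand_matrix(probs, seq).items():
            total[pair] = total.get(pair, 0.0) + p
    if average:
        total = {pair: p / len(aligned_seqs) for pair, p in total.items()}
    return total


if __name__ == "__main__":
    # Hypothetical two-sequence alignment with made-up probabilities.
    alignment = ["GC-AAGC", "GCUAA-C"]
    per_sequence = [
        {(0, 5): 0.9, (1, 4): 0.5},  # positions in "GCAAGC"
        {(0, 5): 0.7, (1, 4): 0.4},  # positions in "GCUAAC"
    ]
    for pair, p in sorted(consensus_matrix(alignment, per_sequence).items()):
        print(pair, round(p, 3))
```

In the toy alignment, the outer pair of both sequences falls on the same alignment columns and therefore accumulates, whereas the inner pairs are shifted against each other by the gaps and remain separate, weaker entries. This is the sense in which summation emphasizes conserved elements.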