Department of Computer Science, State University of New York, Stony Brook 11794.
Nucleic Acids Res 19: 353-8 (1991)
Abstract
We describe a new computer program that identifies conserved secondary
structures in aligned nucleotide sequences of related single-stranded
RNAs. The program employs a series of hash tables to identify and sort
common base paired helices that are located in identical positions in
more than one sequence. The program gives information on the total
number of base paired helices that are conserved between related
sequences and provides detailed information about common helices that
have at least one compensating base change. The program is
useful in the analysis of large biological sequences. We have used it to
examine the number and type of complementary segments (potential base
paired helices) that can be found in common among related random
sequences similar in base composition to 16S rRNA from Escherichia coli.
Two types of random sequences were analyzed. One set consisted of
independent sequences with the same mononucleotide composition as the
16S rRNA. The second set contained sequences that
were 80% similar to one another. Different results were obtained in the
analysis of these two types of random sequences. When 5 sequences that
were 80% similar to one another were analyzed, significant numbers of
potential helices with two or more independent base changes were
observed. When 5 independent sequences were analyzed, no potential
helices were found in common. The results of the analyses with random
sequences were compared with the number and type of helices found in the
phylogenetic model of the secondary structure of 16S ribosomal RNA.
Many more helices are conserved among the ribosomal sequences than are
found in common among similar random sequences. In addition, conserved
helices in the 16S rRNAs are, on average, longer than the
complementary segments that are found in comparable random sequences.
The significance of these results and their application in the analysis
of long non-ribosomal nucleotide sequences are discussed.
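
The hash-table idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' program: the function names (find_helices, common_helices, compensating_changes), the parameters min_len, min_loop and min_shared, the gap-free alignment, the inclusion of G-U wobble pairs, and the rule that a helix counts as common only when its position and length match exactly are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumption: Watson-Crick pairs plus G-U wobble pairs are allowed.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_helices(seq, min_len=3, min_loop=3):
    """Return maximal helices as (i, j, length) tuples.

    A helix pairs seq[i + k] with seq[j - k] for k in range(length),
    leaving at least min_loop unpaired bases inside the innermost pair.
    """
    n = len(seq)
    helices = set()
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if (seq[i], seq[j]) not in PAIRS:
                continue
            # Skip inner pairs of a longer stack; start only at the outer end.
            if i > 0 and j + 1 < n and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            k = 0
            while j - i - 2 * k - 1 >= min_loop and (seq[i + k], seq[j - k]) in PAIRS:
                k += 1
            if k >= min_len:
                helices.add((i, j, k))
    return helices


def common_helices(seqs, min_len=3, min_loop=3, min_shared=2):
    """Hash helices by position and keep those found in >= min_shared sequences.

    Assumption: the sequences are aligned without gaps, so string indices
    are alignment columns.
    """
    table = defaultdict(list)  # (i, j, length) -> indices of sequences
    for idx, seq in enumerate(seqs):
        for key in find_helices(seq, min_len, min_loop):
            table[key].append(idx)
    return {key: members for key, members in table.items()
            if len(members) >= min_shared}


def compensating_changes(seqs, helix, members):
    """Count helix positions where two sequences differ in both paired bases."""
    i, j, length = helix
    count = 0
    for k in range(length):
        pairs = {(seqs[m][i + k], seqs[m][j - k]) for m in members}
        if any(a[0] != b[0] and a[1] != b[1] for a in pairs for b in pairs):
            count += 1
    return count
```

For example, calling common_helices on the two toy sequences "GGGGAAAACCCC" and "GAGGAAAACCUC" reports, among others, the helix (0, 11, 4), and compensating_changes finds one compensating change in it (a G-C pair replaced by A-U). Keying the table on exact position and length is the simplest choice; a more tolerant comparison would hash individual base pairs instead.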
Mesh Headings
Unique Identifier: 91195058
Chemical Identifiers (Names)