Abstract:
Prediction of RNA secondary structure that includes pseudoknots of arbitrary types is a well-known NP-hard problem in computational biology. By limiting the possible types of pseudoknots, the problem can be solved in polynomial time. According to the empirical thermodynamic parameters, the formation of a stem starts to decrease the free energy of the structure only after the third stack of base pairs has formed. Thus, short stems may be unstable and provide only a limited contribution to the overall free energy of a folded RNA molecule. Therefore, a detailed analysis of stems in pseudoknots could help reduce pseudoknot complexity. In this paper, we show that the pseudoknots from experimentally determined RNA spatial structures are primarily formed by short stems of 2–3 base pairs. The short stems tend to avoid hairpins and prefer internal loops, which indicates that they could be energetically insignificant. Excluding short stems reduces the diversity of pseudoknots to two basic types: H-knots and kissing loops.
Key words: pseudoknot, short stem, RNA secondary structure, pseudoknot signature, base pair, stem, group II intron.