Large RNA secondary structure conservation annotation using secondary structure-based MSA

Publication at the Faculty of Mathematics and Physics

2016

Abstract

Identifying conserved regions in a set of RNA secondary structures remains an open research problem for large RNA molecules such as ribosomal RNA. We designed and implemented a method that annotates conservation across a set of RNA molecules using their secondary structures.

The method first converts the secondary structures into linear representations, which are then passed to multiple sequence alignment (MSA). The resulting secondary structure-based MSA is passed to a conservation identification procedure that uses a sliding window to identify conserved positions in the MSA and assigns each of them a score based on the secondary structure content of the window.
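The sliding-window step can be illustrated with a minimal sketch. The abstract does not specify the linear representation or the scoring formula, so everything here is an assumption made for illustration: structures are taken to be dot-bracket strings already aligned with `-` gaps, column agreement is the share of rows carrying the most common non-gap symbol, and the window score multiplies mean agreement by the fraction of paired symbols in the window. The function names `column_agreement` and `window_scores` are hypothetical and are not taken from the published tool.

```python
from collections import Counter


def column_agreement(alignment):
    """Share of rows carrying the most common non-gap symbol, per column."""
    n_rows = len(alignment)
    scores = []
    for column in zip(*alignment):
        counts = Counter(sym for sym in column if sym != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / n_rows)
    return scores


def window_scores(alignment, window=3):
    """Score each column from a window centred on it (assumed formula).

    score = mean column agreement in the window
            * fraction of paired symbols '(' or ')' in the window
    """
    agreement = column_agreement(alignment)
    n_cols = len(agreement)
    half = window // 2
    result = []
    for i in range(n_cols):
        # Clip the window at the alignment boundaries.
        cols = range(max(0, i - half), min(n_cols, i + half + 1))
        mean_agreement = sum(agreement[j] for j in cols) / len(cols)
        cells = [row[j] for row in alignment for j in cols]
        paired = sum(c in "()" for c in cells) / len(cells)
        result.append(mean_agreement * paired)
    return result


# Toy input: three aligned dot-bracket strings of equal length.
toy_alignment = ["((..))", "((..))", "((-.))"]
scores = window_scores(toy_alignment, window=3)
```

Under this assumed formula, a window with no base pairs scores zero even when its columns agree perfectly, which is one possible reading of a score "based on the secondary structure content of the window"; the actual tool may weight structure differently.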

The algorithm can be used to rank the overall conservation of the structures, which generally reflects evolutionary distance, and to assign conservation scores to individual bases in order to identify high- or low-conservation regions. We tested the algorithm for correlation with evolutionary distance, and the results match expectations.
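The two uses described above, per-base annotation and overall ranking, can be sketched as follows. This is an assumption-laden illustration rather than the tool's implementation: per-column scores are taken as given, each base inherits the score of its alignment column, and a structure's overall conservation is taken to be the mean of its per-base scores. The names `per_base_scores` and `rank_structures` are hypothetical.

```python
def per_base_scores(aligned_row, column_scores):
    """Map column scores back onto the bases of one structure, skipping gaps."""
    return [s for sym, s in zip(aligned_row, column_scores) if sym != "-"]


def rank_structures(alignment, column_scores):
    """Rank structures by mean per-base score, highest first (assumed aggregate)."""
    means = {}
    for name, row in alignment.items():
        base_scores = per_base_scores(row, column_scores)
        means[name] = sum(base_scores) / len(base_scores) if base_scores else 0.0
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Toy input: two aligned structures and made-up column scores.
toy_alignment = {"a": "((.))", "b": "((-))"}
toy_column_scores = [1.0, 1.0, 0.5, 1.0, 1.0]
ranking = rank_structures(toy_alignment, toy_column_scores)
```

Thresholding the per-base scores would then separate high- and low-conservation regions; any cut-off value would have to be chosen empirically.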

The method is freely available as a stand-alone tool implemented in the Python programming language.

Keywords

bioinformatics, RNA, secondary structure, conservation