Abstract:
In this work, a model of additional statistical experiments is used to reveal latent periodicity in biological sequences. This model, which generalizes the notion of fuzzy tandem repeats (FTRs), has allowed us to propose original statistical methods for estimating the periodicity pattern in approximate tandem repeats (ATRs). It is shown that, when the percentage of indels in approximate tandem repeats is high, in a number of cases the alignment of copies based on the approximation of the repeat's pattern size according to this model is closer to optimal than the alignment obtained by the well-known Tandem Repeats Finder (TRF) method. Compared with existing analogs, the proposed methods have greater power. Their main advantage is their applicability under the practical conditions of an unrepresentative sample.
Key words: latent periodicity, test-period, profile matrix, spectrum of relative amplitudes.