
Mat. Biolog. Bioinform., 2016 Volume 11, Issue 1, Pages 114–126 (Mi mbb254)

This article is cited in 3 papers

Intellectual Analysis of Data

Entropy approach to the construction of a measure of word symbolic diverseness and its application to clustering of plant genomes

Yu. G. Smetanin [a], M. V. Ulyanov [b, c], A. S. Pestova [d]

a Federal Research Center "Informatics and Control" of the Russian Academy of Sciences, Moscow
b V. A. Trapeznikov Institute of Control Sciences of Russian Academy of Sciences, Moscow
c Faculty of Computational Mathematics and Cybernetics of Lomonosov Moscow State University, Moscow
d Faculty of Computer Science of Higher School of Economics, Moscow

Abstract: An approach to information analysis is considered for the case in which the information is represented by words of finite length over a finite alphabet. A method is proposed for constructing a measure of the symbolic diverseness of words, based on peak characteristics of a shift entropy function. The shift entropy function is formally defined using a unit translation operator and the entropy of discrete distributions. A model example is presented, together with some results of applying the proposed measure to the clustering of plant families based on analysis of the genomes of their representatives.

Key words: shift entropy, measure of symbolic diverseness, clustering of plant genomes.

UDC: 51-76: 57.087

Received 05.04.2016, Published 25.05.2016

DOI: 10.17537/2016.11.114


