RUS  ENG
Full version
JOURNALS // University proceedings. Volga region. Physical and mathematical sciences // Archive

University proceedings. Volga region. Physical and mathematical sciences, 2018 Issue 1, Pages 46–59 (Mi ivpnz166)

Mathematics

An approach to improving algorithms for computing distances between DNA chains (by the example of the Needleman - Wunsch algorithm)

B. Melnikova, M. A. Treninab, A. S. Kochergina

a Russian State Social University, Moscow
b Togliatti State University, Togliatti

Abstract: Background. In practice, it is often necessary to calculate the distance between sequences of a different nature. Similar algorithms are used in bioinformatics to compare sequenced genetic chains. Because of the large dimensionality of such chains, we have to use heuristic algorithms that give approximate results. Therefore, the problem arises of estimating the quality of the metrics (distances) used, which, according to its results, one can conclude that the algorithm is applicable to various studies. Purpose of the study – improving the quality of the distance between thee long strings. Results. Therefore, the problem arises of estimating the quality of the metrics (distances) used, which, according to its results, one can conclude that the algorithm is applicable to various studies. Purpose of the study - improving the quality of the distance between thee long strings. Materials and methods. To compare the genetic chains taken from the open bank NCBI, we offer a heuristic algorithm developed on the basis of the Needleman - Wunsch algorithm. After implementing this algorithm, a special function with three parameters to the obtained metric values is applied, which are determined by the method of gradient descent. Results. A qualitative evaluation of the operation of algorithms for calculating the distance between DNA strings was obtained. An approach to the improvement of such algorithms was developed. Conclusions. It was proposed to improve the Needleman - Wunsch algorithm for comparison of string sequences, and also formulate an approach to improving other algorithms for constructing metrics on long lines.

Keywords: measure of the similarity of DNA sequences, heuristic algorithms, the Needleman - Wunsch algorithm.

UDC: 004.021; 004.023; 51-76

DOI: 10.21685/2072-3040-2018-1-4



© Steklov Math. Inst. of RAS, 2024