A. V. Goloveshkin, S. S. Mikhalkovich, “Robust algorithmic binding to arbitrary fragment of program code”, Program Systems: Theory and Applications, 2022, Volume 13, Issue 1,Pages <nobr>3

Mathematical Foundations of Programming

Robust algorithmic binding to arbitrary fragment of program code

A. V. Goloveshkin, S. S. Mikhalkovich

Southern Federal University, Rostov-on-Don, Russia

Abstract: When developing a program, one of the most frequent programmer's activities is navigating the code. Solving a certain task, a programmer actively interacts with a finite set of code fragments. Preserving the information about locations of these fragments is important for quick navigation, collaboration with other developers, and using this information as a kind of documentation after some time. Integrated development environments (IDEs) provide tools for marking code fragments with labels, displaying lists of labels in the interface of an IDE, and using these labels for quick navigation. However, these tools often cannot maintain an up-to-date correspondence between the label and the marked place when the code is changed, for example, when changes are made outside of an IDE.
Earlier, the authors proposed a tool that can be integrated into multiple IDEs and used for marking large syntactic entities of a program. Marking an entity is also called "binding" to an entity. The main advantage of the approach proposed is that binding is robust to code editing. The labelled element is described as a model built on the basis of the abstract syntax tree (AST) of the program. This model is then used to algorithmically search for the marked element in the edited code. The success rate of such a search is between 99 and 100%.
The main goal of the present research is to perform robust algorithmic binding to an arbitrary part of the code. To achieve this goal, the problems of binding to a single-line code fragment (1) and to a multi-line fragment (2) are solved. To solve the problem (1), an extension of the model describing the marked fragment and an additional search algorithm are proposed. To solve the problem (2), a necessary formalization is introduced, and an algorithm for embedding nodes corresponding to labelled multi-line fragments into the AST is proposed. It is also shown that the correctness of the AST is not violated with such embedding. In the experiment, we use the proposed models and algorithms for binding to randomly selected lines in the codebases of three large projects written in C# and for subsequent algorithmic search for these lines in the edited code. Search results are analyzed manually. It is confirmed that binding is robust to code editing.

Key words and phrases: program markup, algorithmic binding to program code, software development, abstract syntax tree, code similarity measurement.

UDC: 004.4’23+004.415+004.02
BBK: 32.973.3

MSC: Primary 68U99; Secondary 68W05

Received: 28.12.2021
Accepted: 19.02.2022

DOI: 10.25209/2079-3316-2022-13-1-3-33