I. M. Adamovich, O. I. Volkov, “Collective entity resolution in technology of concrete historical investigation support”, Sistemy i Sredstva Inform., 2024, Volume 34, Issue 1, Pages 128

This article is cited in 1 paper

Collective entity resolution in technology of concrete historical investigation support

I. M. Adamovich, O. I. Volkov

Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119133, Russian Federation

Abstract: The article continues the development of a distributed technology for supporting concrete historical investigation. The technology is based on the principles of crowdsourcing and is aimed at a wide range of users who are not professional historians or biographers. It is extended with an entity resolution algorithm for processing nominative documents; the algorithm performs collective resolution, in which the entities for matching links are determined jointly. The algorithm is a modification of the greedy agglomerative clustering algorithm. The article describes in detail the approach underlying the algorithm and gives its high-level pseudocode. The effectiveness of the algorithm is analyzed on data with varying degrees of name ambiguity, and the degree of name ambiguity in concrete historical data is estimated. The authors conclude that including the algorithm in the technology is expedient. Directions for further research on determining the configurable parameters of the algorithm are outlined.

Keywords: concrete historical investigation, distributed technology, entity resolution, greedy algorithm, relational similarity measure.

Received: 09.01.2024

DOI: 10.14357/08696527240111
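
Illustration: the abstract names greedy agglomerative clustering with a relational similarity measure but does not reproduce the pseudocode here. The following is a minimal sketch of a generic greedy agglomerative collective resolution loop, not the authors' modification. The toy name measure `name_sim`, the weight `alpha`, the stopping `threshold`, the function `resolve`, and the sample records are all assumptions made for illustration.

```python
from itertools import combinations


def name_sim(a, b):
    """Toy attribute similarity (assumption): 1.0 for identical names,
    0.5 for same surname with prefix-compatible given names, else 0.0."""
    a, b = a.lower(), b.lower()
    if a == b:
        return 1.0
    *given_a, sur_a = a.replace(".", "").split()
    *given_b, sur_b = b.replace(".", "").split()
    ga, gb = " ".join(given_a), " ".join(given_b)
    if sur_a == sur_b and (ga.startswith(gb) or gb.startswith(ga)):
        return 0.5
    return 0.0


def resolve(names, links, alpha=0.5, threshold=0.5):
    """Generic greedy agglomerative collective entity resolution (sketch).

    names: dict mapping reference id -> name string
    links: iterable of sets of reference ids that co-occur in one document
    alpha: hypothetical weight of the relational part of the similarity
    """
    cluster_of = {r: r for r in names}      # reference -> cluster label
    members = {r: {r} for r in names}       # cluster label -> its references
    neighbours = {r: set() for r in names}  # reference -> co-occurring references
    for doc in links:
        for r in doc:
            neighbours[r] |= set(doc) - {r}

    def linked_clusters(c, other):
        # Clusters linked to c under the current partition, so earlier
        # merges change the result; the compared pair itself is excluded.
        found = {cluster_of[n] for r in members[c] for n in neighbours[r]}
        return found - {c, other}

    def sim(c1, c2):
        attr = max(name_sim(names[a], names[b])
                   for a in members[c1] for b in members[c2])
        n1, n2 = linked_clusters(c1, c2), linked_clusters(c2, c1)
        union = n1 | n2
        rel = len(n1 & n2) / len(union) if union else 0.0  # Jaccard overlap
        return (1 - alpha) * attr + alpha * rel

    while len(members) > 1:
        # Greedy step: merge the currently most similar pair of clusters.
        best = max(combinations(sorted(members), 2), key=lambda p: sim(*p))
        if sim(*best) < threshold:
            break
        keep, drop = best
        for r in members[drop]:
            cluster_of[r] = keep
        members[keep] |= members.pop(drop)
    return sorted(sorted(m) for m in members.values())


# Hypothetical records: "I. Petrov" could be either Ivan or Igor Petrov.
names = {"r1": "Ivan Petrov", "r2": "Anna Volkova",
         "r3": "I. Petrov", "r4": "Anna Volkova",
         "r5": "Igor Petrov", "r6": "Maria Sidorova"}
links = [{"r1", "r2"}, {"r3", "r4"}, {"r5", "r6"}]

collective = resolve(names, links)            # [['r1', 'r3'], ['r2', 'r4'], ['r5'], ['r6']]
names_only = resolve(names, links, alpha=0.0)  # [['r1', 'r3', 'r5'], ['r2', 'r4'], ['r6']]
```

In this toy data the two "Anna Volkova" references merge first. That merge gives "Ivan Petrov" and "I. Petrov" a shared neighbour, which pushes them over the threshold, while "Igor Petrov" stays separate. With `alpha=0.0` the relational evidence is ignored and all three Petrov references collapse into one cluster, which is the kind of name ambiguity the abstract refers to.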