Sistemy i Sredstva Informatiki [Systems and Means of Informatics]

Sistemy i Sredstva Inform., 2023 Volume 33, Issue 2, Pages 132–141 (Mi ssi891)

Application of the CHAID algorithm in the technology of concrete historical investigation support

I. M. Adamovich, O. I. Volkov

Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119133, Russian Federation

Abstract: This article continues a series of works on the technology of concrete historical investigation support. The technology is based on the principles of co-creation and crowdsourcing and is designed for a wide range of users who are not professional historians or biographers. The article applies the decision tree method based on the CHAID algorithm to automatically fill information gaps in a set of historical facts in order to identify potentially promising areas of research. The algorithm is described, and the reliability of its results when the data contain a high proportion of missing values is evaluated. The proportion of lacunae in the main sources of multiple facts is estimated, and it is concluded that the algorithm is applicable in principle and effective, taking into account the specifics of the technology. It is also shown that the CHAID algorithm extends and supplements the technology's existing means of detecting anomalies in concrete historical data.

Keywords: concrete historical investigation, distributed technology, CHAID algorithm, missing data, anomalies.

Received: 17.01.2023

DOI: 10.14357/08696527230213


