
Artificial Intelligence and Decision Making, 2023 Issue 4, Pages 49–57 (Mi iipr47)

This article is cited in 1 paper

Computational intelligence

Modified nonparametric algorithm for automatic classification of large-volume statistical data and its application

V. P. Tuboltsev (a), A. V. Lapko (a, b), A. L. Vasily (a, b)

a M. F. Reshetnev Siberian State University of Science and Technologies, Krasnoyarsk, Russia
b Institute of Computational Modelling, Siberian Branch of the Russian Academy of Sciences, Krasnoyarsk, Russia

Abstract: A modified nonparametric algorithm for the automatic classification of large-volume statistical data is proposed. It detects classes that correspond to unimodal fragments of the probability density of a multidimensional random variable. The initial data are compressed by partitioning the multidimensional feature space into sampling intervals; the result is a data array consisting of the interval centers and the corresponding frequencies with which values of the random variable fall into each interval. From these data, a regression estimate of the probability density is synthesized, and this estimate is the basis for the algorithmization of the automatic classification procedure. A class is a compact group of observations of the random variable that corresponds to a unimodal fragment of the probability density. The computational efficiency of the modified algorithm is provided by the compression of the initial data and by the improvement and algorithmization of the traditional nonparametric method for detecting compact groups of observations of a random variable. The effectiveness of the developed method is confirmed by the results of its application to the analysis of remote sensing data on forests damaged by the Siberian silkworm.
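
The pipeline outlined in the abstract (compression into interval centers and frequencies, a density estimate built from them, classes as unimodal fragments) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the Gaussian kernel, the bandwidth `c`, the hill-climbing rule over adjacent cells, the function names, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def compress(X, bins):
    """Replace the sample by the centers of nonempty cells and their frequencies."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    width = (hi - lo) / bins                      # cell size per feature
    cell = np.minimum(((X - lo) / width).astype(int), bins - 1)
    idx, inverse, counts = np.unique(cell, axis=0,
                                     return_inverse=True, return_counts=True)
    centers = lo + (idx + 0.5) * width
    # inverse maps each observation to the row of its cell in idx
    return idx, centers, counts / len(X), width, inverse.ravel()

def density_at(points, centers, freqs, c):
    """Assumed estimate: Gaussian kernels at cell centers, weighted by frequencies."""
    u = (points[:, None, :] - centers[None, :, :]) / c
    kern = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    return (freqs * np.prod(kern / c, axis=2)).sum(axis=1)

def classify_cells(idx, dens):
    """Assumed rule: each cell climbs to the densest adjacent nonempty cell;
    cells that reach the same local maximum form one class."""
    parent = np.arange(len(idx))
    for i in range(len(idx)):
        adj = np.flatnonzero(np.abs(idx - idx[i]).max(axis=1) <= 1)  # includes i
        parent[i] = adj[np.argmax(dens[adj])]
    while True:                                   # follow pointers to the modes
        nxt = parent[parent]
        if np.array_equal(nxt, parent):
            break
        parent = nxt
    return np.unique(parent, return_inverse=True)[1]

# Illustrative data: two well-separated synthetic groups (not from the paper).
rng = np.random.default_rng(0)
X = np.vstack([np.clip(rng.normal(1.0, 0.3, (500, 2)), 0, 2),
               np.clip(rng.normal(5.0, 0.3, (500, 2)), 4, 6)])
idx, centers, freqs, width, inverse = compress(X, bins=12)
dens = density_at(centers, centers, freqs, c=1.5 * width)
labels = classify_cells(idx, dens)[inverse]       # class label per observation
print(len(idx), "cells,", len(np.unique(labels)), "classes")
```

In this sketch the density is evaluated only at the centers of nonempty cells, so the cost of class detection depends on the number of cells rather than on the sample size, which is the kind of saving the compression step is meant to provide. A bandwidth that is too small relative to the cell size can split one group into several spurious modes.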

Keywords: nonparametric algorithm for automatic classification, regression estimation of probability density, discretization of the range of random variables, woodlands, remote sensing data.

DOI: 10.14357/20718594230405




