D. Potapov, “Implementation of the module for determining complex load parameters for self-adapting data containers”, Informatsionnye Tekhnologii i Vychislitel'nye Sistemy, 2019, Issue 1, Pages 87

DATA PROCESSING AND ANALYSIS

Implementation of the module for determining complex load parameters for self-adapting data containers

D. Potapov

Voronezh State University, Voronezh, Russia

Abstract: In applications that work with large amounts of static or read-mostly data, caching greatly improves performance. To achieve maximum efficiency, an adaptive data storage implementation can change the cache size dynamically during execution, based on the container load and on the difference in speed between the main container and the cache. The main parameter of the load is the set of requested data, which in the general case can be described by a Gaussian distribution. In practice, however, the container load is usually a combination of simple loads, because requests to the data storage can come from many applications or from different tasks. The parameters of these simple loads therefore have to be identified to achieve maximum cache efficiency. This paper presents the results of implementing a module that determines complex load parameters for self-adapting data containers. It explains the choice of the EM modification and of k-means++ initialization and briefly describes the module structure. Clustering quality (for one and for many clusters, with concept drift, and with a time frame) and module execution time are analyzed. The test results indicate that the module determines complex load parameters well enough to be used effectively in self-adapting data containers.
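The idea of treating a complex load as a mixture of Gaussian simple loads can be illustrated with a minimal sketch. It is not the module described in the paper: it uses plain EM rather than the paper's EM modification (which the abstract does not specify), works on one-dimensional request keys, and assumes the number of simple loads `k` is known. All function names, parameters, and the synthetic load are hypothetical.

```python
import math
import random


def kmeans_pp_seeds(xs, k, rng):
    """Pick k initial centers by k-means++ (sampling proportional to squared distance)."""
    centers = [rng.choice(xs)]
    while len(centers) < k:
        # Squared distance from each point to its nearest already-chosen center.
        d2 = [min((x - c) ** 2 for c in centers) for x in xs]
        if sum(d2) == 0:
            centers.append(rng.choice(xs))  # all points coincide with a center
        else:
            centers.append(rng.choices(xs, weights=d2, k=1)[0])
    return centers


def normal_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(xs, k, iters=100, seed=0, min_var=1e-6):
    """Plain EM for a 1-D Gaussian mixture.

    Returns a list of (weight, mean, variance) tuples sorted by mean.
    """
    rng = random.Random(seed)
    n = len(xs)
    means = kmeans_pp_seeds(xs, k, rng)
    overall_mean = sum(xs) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in xs) / n, min_var)
    variances = [overall_var] * k  # start every component with the data variance
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each request.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # component received no mass; keep its old parameters
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj,
                min_var,
            )
    return sorted(zip(weights, means, variances), key=lambda c: c[1])


# Synthetic complex load: two hypothetical tasks requesting keys around 100 and 400.
load_rng = random.Random(1)
requests = ([load_rng.gauss(100, 5) for _ in range(300)]
            + [load_rng.gauss(400, 20) for _ in range(300)])
components = fit_gmm_1d(requests, k=2)
for weight, mean, var in components:
    print(f"weight={weight:.2f} mean={mean:.1f} std={math.sqrt(var):.1f}")
```

Each recovered component approximates one simple load: its mean and standard deviation indicate which keys that load concentrates on, and its weight indicates the share of requests it generates. A real module would also have to choose `k` and handle drifting loads, which this sketch leaves out.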

Keywords: data storage, cache efficiency, optimal data storage, adaptive data container, container load, Gaussian mixture model, clustering, EM, k-means.

DOI: 10.14357/20718632190108