RUS  ENG
Full version
JOURNALS // Prikladnaya Diskretnaya Matematika // Archive

Prikl. Diskr. Mat., 2018 Number 39, Pages 116–127 (Mi pdm610)

This article is cited in 1 paper

Computational Methods in Discrete Mathematics

Fast algorithm of cluster analysis $k$-medoids

I. N. Dmitriev

Federal Educational and Methodological Association in Information Security, Moscow, Russia

Abstract: The algorithm $k$-medoids (KM) is widely used for clustering graphs. The following modifications of it by adding some procedures have been researched in a computer experiments: CKM=KM+CLARA (Cluster LARge Application), LSKM=KM+L-SPAR (Local SPARsification), BKM=KM+CLARA+L-SPAR, NKM=KM+NH (New Heuristic choosing cluster center), CNKM=NKM+CLARA, LSNKM=NKM+L-SPAR, FKM=NKM+CLARA+L-SPAR. The experiment results show that L-SPAR procedure allows to decrease the average running time for LSKM by 5.3 %, BKM by 6.1 %, LSNKM by 8.1 %, FKM by 8.3 % and to improve the clustering quality by 2 %, 2.1 %, 3 %, 3.3 % respectively. Besides, the number of iterations in NKM is twice more than in KM, but the running time of NKM is an order less than of KM, the running time of CKM is two thirds of the running time of KM, the computer complexity of CNKM linearly depends on the graph size, and the application of CLARA does not diminish it appreciably. NH allows 16 times decreasing the computer complexity of KM without loosing the clustering quality. The experiments were conducted on data arrays that represented graphs of social networks YouTube, Live Journal, and the shop Amazon.

Keywords: fast algorithm of cluster analysis, PAM, CLARA, Local Graph Sparsification.

UDC: 519.254

DOI: 10.17223/20710410/39/11



Bibliographic databases:


© Steklov Math. Inst. of RAS, 2024