D. I. Samokhvalov, “Machine learning-based malicious users' detection in the VKontakte social network”, Proceedings of ISP RAS, 2020, Volume 32, Issue 3, Pages 109

This article is cited in 3 papers

Machine learning-based malicious users' detection in the VKontakte social network

D. I. Samokhvalov

National Research University Higher School of Economics

Abstract: This paper presents a machine learning-based approach to detecting malicious users in VKontakte, the largest Russian online social network. An exploratory data analysis was conducted to identify insights and anomalies in a dataset consisting of 42394 malicious and 241035 genuine accounts. A tool for the automated collection of information about malicious accounts in VKontakte was also developed and used to gather the dataset described in this research. Baseline feature engineering was performed, and the CatBoost classifier was used to build a classification model. The results show that this model identifies malicious users with an overall AUC score of 0.91, validated with 4-fold cross-validation.
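
The evaluation protocol named in the abstract (AUC averaged over 4-fold cross-validation) can be illustrated with a minimal, dependency-free sketch. Everything below is an assumption for illustration, not the author's code: the function names are hypothetical, the folds are stratified by class (the abstract does not say whether stratification was used, though the roughly 1:6 class imbalance makes it a common choice), and a simple mean-difference scorer stands in for the CatBoost classifier so that the example runs without external libraries.

```python
import random


def roc_auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) form; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def stratified_folds(labels, k, seed=0):
    """Split indices into k folds, keeping the class ratio similar in each fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def fit_mean_difference(X, y):
    """Hypothetical stand-in model (NOT CatBoost): project onto the class-mean difference."""
    dim = len(X[0])
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]
    w = [
        sum(x[d] for x in pos) / len(pos) - sum(x[d] for x in neg) / len(neg)
        for d in range(dim)
    ]
    return lambda x: sum(wi * xi for wi, xi in zip(w, x))


def cross_validated_auc(features, labels, fit, k=4, seed=0):
    """Mean held-out AUC over k folds; `fit(X, y)` must return a scoring function."""
    aucs = []
    for held_out in stratified_folds(labels, k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        score = fit([features[i] for i in train], [labels[i] for i in train])
        aucs.append(
            roc_auc(
                [labels[i] for i in held_out],
                [score(features[i]) for i in held_out],
            )
        )
    return sum(aucs) / k


# Toy, perfectly separable data (not the paper's dataset).
X = [[float(i)] for i in range(20)]
y = [0] * 12 + [1] * 8
mean_auc = cross_validated_auc(X, y, fit_mean_difference, k=4)
print(f"mean AUC over 4 folds: {mean_auc:.2f}")
```

To move closer to the setup the abstract describes, the `fit` argument could wrap a gradient-boosting classifier that returns predicted probabilities for the malicious class; the fold loop and the AUC averaging would stay the same.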

Keywords: VKontakte, malicious users, machine learning, social networks, classification models.

Language: English

DOI: 10.15514/ISPRAS-2020-32(3)-10