
Computer Optics, 2018, Volume 42, Issue 5, Pages 921–927 (Mi co577)

This article is cited in 22 papers

NUMERICAL METHODS AND DATA ANALYSIS

Clustering of media content from social networks using bigdata technology

I. A. Rytsarev (a), D. V. Kirsh (b, a), A. V. Kupriyanov (b, a)

(a) Samara National Research University, Moskovskoye shosse, 34, 443086, Samara, Russia
(b) IPSI RAS – Branch of the FSRC “Crystallography and Photonics” RAS, Molodogvardeyskaya 151, 443001, Samara, Russia

Abstract: The article addresses one of the key problems of social network analysis: classifying accounts based on the media content uploaded by their users. The main difficulties are the heterogeneity of the content (in both format and subject) and the large volume of data, which lead to excessive computational complexity and often render traditional analysis methods completely ineffective. We discuss an approach to clustering media content from social networks based on textual annotation, using BigData technology, a modern and efficient tool for processing large volumes of data. For the computational experiments, a large sample of heterogeneous images (photographs, paintings, postcards, etc.) was collected from real Twitter accounts. The results confirmed the high quality of the media content clustering: the average error was around 5%.

Keywords: cluster analysis, BigData technology, text annotation, social networks, media content analysis, k-means clustering, GoogLeNet.
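
As a rough illustration of the idea summarized in the abstract (this is not the authors' implementation), the sketch below clusters accounts by the text labels attached to their images. It assumes the labels have already been produced by an image classifier such as GoogLeNet. The account names, the labels, the helper functions, and the farthest-point initialization are all hypothetical choices made for this example, and the BigData processing layer is not shown.

```python
from collections import Counter

# Hypothetical data: text labels that an image classifier (for example GoogLeNet)
# might assign to the images of each account. Not taken from the article.
accounts = {
    "acct_cats": ["tabby cat", "persian cat", "tabby cat", "egyptian cat"],
    "acct_pets": ["tabby cat", "persian cat", "golden retriever"],
    "acct_cars": ["sports car", "convertible", "sports car"],
    "acct_auto": ["convertible", "sports car", "minivan"],
}

def label_histogram(labels, vocabulary):
    """Turn a list of labels into a normalized frequency vector."""
    counts = Counter(labels)
    total = sum(counts.values()) or 1
    return [counts[word] / total for word in vocabulary]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20):
    """Plain k-means with a deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point farthest from its nearest already-chosen centroid.
        centroids.append(max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids)))
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

vocabulary = sorted({label for labels in accounts.values() for label in labels})
names = list(accounts)
vectors = [label_histogram(accounts[name], vocabulary) for name in names]
assignments = dict(zip(names, kmeans(vectors, k=2)))

for name, cluster in assignments.items():
    print(name, "-> cluster", cluster)
```

On this toy data the two animal-themed accounts end up in one cluster and the two car-themed accounts in the other. At the data volumes described in the abstract, a distributed k-means implementation would presumably replace the in-memory loop.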

Received: 24.10.2018
Accepted: 30.10.2018

DOI: 10.18287/2412-6179-2018-42-5-921-927


