Vestnik Sankt-Peterburgskogo Universiteta. Seriya 10. Prikladnaya Matematika. Informatika. Protsessy Upravleniya

Vestnik S.-Petersburg Univ. Ser. 10. Prikl. Mat. Inform. Prots. Upr., 2017 Volume 13, Issue 3, Pages 313–325 (Mi vspui341)

Computer science

Textual trends detection at OK

E. A. Malyutin, D. Yu. Bugaichenko, A. N. Mishenin

St. Petersburg State University, 7–9, Universitetskaya nab., St. Petersburg, 199034, Russian Federation

Abstract: Social networks now serve not merely as entertainment but as an information distribution channel that is replacing classical mass media. In this article we describe a scalable trend detection system implemented at the social network OK. Actors in social networks (users and communities) shape a broad agenda. Social network content is specific, and applying standard media analysis methods to it seems infeasible. This creates a natural demand for software that detects and analyzes textual trends. The academic literature offers two main approaches to trend detection: topic modeling (followed by analysis of topic evolution) and distributive models based on frequency-like properties of distinct terms. We reviewed scientific papers on both approaches, taking into account the specific features of social networks. As a result, distributive models were chosen as the basis for the system. OK is one of the largest social networks in Russia and the CIS countries. Its actors generate over 100M characters of text every day, so even basic processing is a serious technical problem, which forces us to use Big Data approaches throughout development. We introduce a lambda architecture built on three main components. The article describes in detail the architecture and technical features of each component. In conclusion, we present the results of operating the system and discuss areas for further research and development. Refs 13. Figs 7. Table 1.
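As an illustration only, the idea of a model "based on frequency-like properties of distinct terms" can be sketched in a few lines of Python. This is not the system described in the article. The function names (`term_counts`, `trend_scores`), the whitespace tokenization, the additive smoothing, and the log-ratio score are all assumptions chosen for brevity. The sketch compares how often each term occurs in a current time window against a baseline window and ranks terms by the smoothed log ratio of their relative frequencies.

```python
from collections import Counter
import math


def term_counts(docs):
    """Count lowercase whitespace-separated tokens across a list of texts."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts


def trend_scores(current_docs, baseline_docs, smoothing=1.0, min_count=2):
    """Score each sufficiently frequent current-window term by the log ratio
    of its smoothed relative frequency now versus in the baseline window.

    Hypothetical scoring rule for illustration; a positive score means the
    term is relatively more frequent now than in the baseline.
    """
    cur = term_counts(current_docs)
    base = term_counts(baseline_docs)
    cur_total = sum(cur.values())
    base_total = sum(base.values())
    vocab_size = len(set(cur) | set(base))

    scores = {}
    for term, count in cur.items():
        if count < min_count:
            continue  # skip rare terms, whose ratios are too noisy
        p_cur = (count + smoothing) / (cur_total + smoothing * vocab_size)
        p_base = (base.get(term, 0) + smoothing) / (base_total + smoothing * vocab_size)
        scores[term] = math.log(p_cur / p_base)
    return scores


baseline = ["weather is nice today", "nice weather again", "weather report"]
current = ["eclipse tonight", "watch the eclipse", "weather and eclipse"]
ranked = sorted(trend_scores(current, baseline).items(), key=lambda kv: -kv[1])
```

At the scale mentioned in the abstract, counting of this kind would have to be distributed across batch and streaming jobs; the sketch only shows the per-term comparison itself.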

Keywords: natural language processing, trend detection, big data.

UDC: 519.688

Received: March 5, 2017
Accepted: June 8, 2017

DOI: 10.21638/11701/spbu10.2017.308




