
Nechetkie Sistemy i Myagkie Vychisleniya, 2021 Volume 16, Issue 2, Pages 77–95 (Mi fssc81)

A model for estimating the posting frequency in an online social media with incomplete data using objective determinants of users' behaviour

V. F. Stoliarova (a), A. Toropova (b), A. L. Tulupyev (b, a)

a St. Petersburg Federal Research Center of the Russian Academy of Sciences, St. Petersburg
b St. Petersburg State University, St. Petersburg

Abstract: User profiling involves estimating how often a user performs certain actions in online social media, such as posting. When resources are limited, the only information available may be imprecise data on the last several posting episodes, gathered via an interview. Posting frequency estimates obtained from such limited data can be used to assess individual risk connected with the use of online social media, for example in medicine or cybersecurity. In this paper we construct a Bayesian belief network (BBN) for this problem that incorporates not only the limited data on the times of the last several posts, but also objective data from the user’s profile: age, sex, and friends count. Using a training dataset gathered via the VKontakte API, we estimated conditional probability tables for two expert BBN structures (the existing reduced structure, based only on the dates of the last several posts, and a novel extended structure that incorporates objective behaviour determinants) and automatically learned the optimal structure for the training data. Both extended models (expert and learned) showed lower values of the information criteria (the Akaike information criterion and the Bayesian information criterion). We then used a test dataset to assess how well the models classify the true frequency value. All three models showed similar results in terms of accuracy, kappa, and average accuracy. This result is related to the weak strength of the arcs between the frequency variable and the objective behaviour determinants. Nevertheless, including such variables is important in applications in order to construct a comprehensive structure of knowledge in the area of interest.
The practical significance of the work lies in the possibility of applying the proposed model to assess posting frequency in an online social network, in particular for risk modelling tasks in public health and socio-cybersecurity.

Keywords: online social networks, posting frequency, Bayesian belief networks, behaviour determinants, user profiling.

UDC: 004.891, 311.2

MSC: 68T37

Received: 25.08.2021
Revised: 06.12.2021

DOI: 10.26456/fssc81




© Steklov Math. Inst. of RAS, 2024