O. N. Tushkanova, “Experimental study of the numerical measures for mining associative and causal relationship in big data”, Informatsionnye Tekhnologii i Vychislitel'nye Sistemy, 2015, Issue 3, Pages 23

INFORMATION PROCESSING AND DATA ANALYSIS

Experimental study of the numerical measures for mining associative and causal relationship in big data

O. N. Tushkanova

St. Petersburg Institute for Informatics and Automation of RAS

Abstract: Big data analysis is one of the most pressing problems in information technology. In this context, associative and causal analyses are regarded as promising approaches to efficiently discovering relationships between big data attributes. However, traditional models for causal structure discovery have exponential complexity. The current trend in big data causal analysis is to use various measures that indicate the “strength” of association between pairs of attributes. Yet data scientists have no guidance on which of these measures are preferable in particular applications. The paper surveys the numerical measures proposed to date and compares them theoretically and experimentally in order to identify those that best meet the basic requirements of big data processing. Conclusions are drawn regarding the most promising measures, which are recommended to researchers and practitioners in big data causal analysis.

Keywords: association measure, causal measure, causal analysis, big data.