Abstract:
The article examines machine learning methods for detecting SQL injection in network logs using the KNIME analytics platform. The approach learns patterns in the input features and uses them for prediction in a binary classification problem. Unlike existing works, this article compares the effectiveness of five tree-based machine learning methods. The content and sequence of the work stages are described. The Random Forest method achieved the best results (accuracy of 97.58%; area under the ROC curve of 0.976).
Keywords: machine learning; KNIME; classification; dataset; data selection; SQL injection; network threat detection; detection of suspicious patterns; web application protection.