D. M. Voynov, V. A. Kovalev, “The stability of neural networks under condition of adversarial attacks to biomedical image classification”, Journal of the Belarusian State University. Mathematics and Informatics, 2020, Volume 3, Pages 60

Theoretical foundations of computer science

The stability of neural networks under condition of adversarial attacks to biomedical image classification

D. M. Voynov^a, V. A. Kovalev^b

^a Belarusian State University, 4 Niezaliežnasci Avenue, Minsk 220030, Belarus
^b United Institute of Informatics Problems, National Academy of Sciences of Belarus, 6 Surhanava Street, Minsk 220012, Belarus

Abstract: Most research and development teams working in the field of deep learning currently concentrate on improving classification accuracy and related measures of image classification quality, whereas the problem of adversarial attacks on deep neural networks attracts much less attention. This article presents an experimental study of how various factors influence the stability of convolutional neural networks under adversarial attacks on biomedical image classification. On a very extensive dataset consisting of more than 1.45 million radiological and histological images, we assess the efficiency of attacks performed using the projected gradient descent (PGD), DeepFool and Carlini-Wagner (CW) methods. We analyze the results of both white-box and black-box attacks on commonly used neural architectures such as InceptionV3, Densenet121, ResNet50, MobileNet and Xception. The basic conclusion of this study is that in the field of biomedical image classification the problem of adversarial attacks remains acute: the tested attack methods successfully attack the above-mentioned networks, so that, depending on the specific task, their original classification accuracy falls from 83-97 % to 15 %. It was also found that under similar conditions the PGD method is less successful in adversarial attacks than the DeepFool and CW methods. When the original images and adversarial examples are compared using the $L_{2}$-norm, the DeepFool and CW methods generate adversarial examples of similar maliciousness. In addition, in three out of four black-box attacks, the PGD method demonstrated lower attack efficiency.

Keywords: deep learning; adversarial attacks; biomedical images.

UDC: 004.9

DOI: 10.33581/2520-6508-2020-3-60-72