
Program Systems: Theory and Applications, 2024 Volume 15, Issue 4, Pages 97–110 (Mi ps458)

Hardware, software and distributed supercomputer systems

Building robust malware detection through Conditional Generative Adversarial Network-based data augmentation

E. Baghirov

Institute of Information Technology, Baku, Azerbaijan

Abstract: Malware detection is essential in cybersecurity, yet its accuracy is often compromised by class imbalance and limited labeled data. This study leverages Conditional Generative Adversarial Networks (cGANs) to generate synthetic malware samples, addressing these challenges by augmenting the minority class.
The cGAN model generates realistic malware samples conditioned on class labels, balancing the dataset without altering the benign class. The approach is applied to the CICMalDroid2020 dataset, and the augmented data is used to train a LightGBM model, which improves detection accuracy, particularly for underrepresented malware classes.
The results demonstrate the efficacy of cGANs as a robust data augmentation tool, enhancing the performance and reliability of machine learning-based malware detection systems.
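
The balancing step described above can be illustrated with a minimal sketch. It is not the author's implementation, and it rests on assumptions not stated in the abstract: the target size for each malware class is taken to be the size of the largest class, and the generator is conditioned by appending a one-hot class label to the noise vector, which is a common cGAN convention. The function names, the label strings, and the toy counts are hypothetical.

```python
from collections import Counter


def synthetic_sample_plan(labels, benign_label="benign"):
    """Return how many synthetic samples each malware class would need.

    Assumption: every malware class is raised to the size of the largest
    class. The benign class is never augmented.
    """
    counts = Counter(labels)
    target = max(counts.values())
    return {
        cls: target - n
        for cls, n in counts.items()
        if cls != benign_label and n < target
    }


def conditioned_input(noise, class_index, num_classes):
    """Build a generator input by appending a one-hot label to the noise.

    Assumption: conditioning by concatenation; the paper may condition
    the generator differently.
    """
    if not 0 <= class_index < num_classes:
        raise ValueError("class_index out of range")
    one_hot = [0.0] * num_classes
    one_hot[class_index] = 1.0
    return list(noise) + one_hot


# Toy label list with hypothetical counts (not taken from CICMalDroid2020).
labels = ["benign"] * 5 + ["adware"] * 2 + ["sms"] * 5 + ["banking"] * 1
plan = synthetic_sample_plan(labels)
print(plan)  # {'adware': 3, 'banking': 4}

# A 3-dimensional noise vector conditioned on class 1 of 4.
print(conditioned_input([0.1, -0.2, 0.3], 1, 4))
```

In such a setup, the plan would tell the trained generator how many samples to draw per class label, and the real and synthetic rows together would form the training set for the downstream classifier.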

Key words and phrases: malware detection, Generative Adversarial Networks, machine learning, cybersecurity, data augmentation

UDC: 519.683.1: 681.513.7
BBK: 32.813.5+32.973.1

Received: 05.12.2024
Accepted: 07.12.2024

Language: English

DOI: 10.25209/2079-3316-2024-15-4-97-110


