
Proceedings of ISP RAS, 2025 Volume 37, Issue 5, Pages 111–122 (Mi tisp1045)

Tuning LLM in secure code generation

D. S. Shaikhelislamov (a, b, c), M. S. Varets (a, d), A. S. Syomkin (a), O. Y. Rogov (e)

a National Research University Higher School of Economics
b Moscow Institute of Physics and Technology (National Research University)
c Ivannikov Institute for System Programming of the RAS
d MIREA — Russian Technological University, Moscow
e Artificial Intelligence Research Institute

Abstract: The growing use of LLMs for code generation makes comprehensive verification of the security and reliability of the generated code mandatory. To verify the generated code, we propose using the static analyzer Svace, which builds the code with its integrated compiler and checks it for weaknesses. The generated code is processed by Svace, and the LLM is then re-prompted with the detected warnings or errors and asked to correct them. In addition, we fine-tune the Qwen2.5-Coder model with direct preference optimization (DPO) on pairs of erroneous and corrected code covering common syntax and runtime errors. This reduced the error rate, including syntactic errors and vulnerabilities, by 20%. To evaluate the models, we assembled a specialized dataset from open LLM evaluation benchmarks, focusing on tasks where models tend to generate erroneous code. The experimental results show that fine-tuning with a focus on code quality yields code with fewer typical errors. In this work, we combine an iterative prompting mechanism with DPO to improve the security and accuracy of LLM code generation.
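The analyze-and-repair cycle described in the abstract can be sketched as a small loop: generate code, run the static analyzer, and, while warnings remain, re-prompt the model with those warnings. A minimal sketch follows; the `run_analyzer` and `llm_fix` functions are hypothetical stand-ins (the paper uses Svace and Qwen2.5-Coder, neither of which is invoked here), so only the control flow reflects the described pipeline.

```python
def run_analyzer(code: str) -> list[str]:
    # Hypothetical stand-in for a Svace run: return a list of warning
    # messages for the given source. Here: flag an assigned-but-unused 'x'.
    if "x = " in code and "print(x)" not in code:
        return ["unused variable 'x'"]
    return []

def llm_fix(code: str, warnings: list[str]) -> str:
    # Hypothetical stand-in for re-prompting the LLM with analyzer
    # feedback. Here: simply drop the offending assignment.
    return code.replace("x = 1\n", "") if warnings else code

def repair_loop(code: str, max_rounds: int = 3) -> str:
    # Iterate analysis and correction until the analyzer is silent
    # or the round budget is exhausted.
    for _ in range(max_rounds):
        warnings = run_analyzer(code)
        if not warnings:
            break
        code = llm_fix(code, warnings)
    return code

fixed = repair_loop("x = 1\nprint('hi')\n")
```

The round budget (`max_rounds`) bounds the cost of repeated analyzer and model calls; in practice most fixable warnings disappear within the first one or two rounds.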
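The DPO objective used for fine-tuning on erroneous/corrected code pairs reduces, per preference pair, to a logistic loss on the margin between policy and reference log-probabilities. A minimal sketch of that per-pair loss, assuming the completions' summed token log-probabilities are already available (how they are computed for Qwen2.5-Coder is outside this sketch):

```python
import math

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    # DPO loss for one preference pair: -log sigmoid(beta * margin),
    # where the margin compares how much the policy has moved, relative
    # to the reference model, toward the preferred (corrected) completion
    # and away from the dispreferred (erroneous) one.
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

At initialization the policy equals the reference, the margin is zero, and the loss is log 2; the loss falls as the policy shifts probability mass toward the corrected completions.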

Keywords: code generation; large language models; static analysis; analyzer feedback; code security; fine-tuning.

Language: English

DOI: 10.15514/ISPRAS-2025-37(5)-8



© Steklov Math. Inst. of RAS, 2025