
Proceedings of ISP RAS, 2019 Volume 31, Issue 6, Pages 33–64 (Mi tisp469)

This article is cited in 2 papers

A software complex for revealing malicious behavior in untrusted binary code

A. B. Bugerya (a, b), V. Yu. Efimov (b), I. I. Kulagin (b), V. A. Padaryan (c, b), M. A. Solovev (b, c), A. Yu. Tikhonov (d)

a Keldysh Institute of Applied Mathematics of the Russian Academy of Sciences
b Ivannikov Institute for System Programming of the RAS
c Lomonosov Moscow State University
d Bauman Moscow State Technical University

Abstract: One of the main problems in binary code security analysis is revealing malicious behavior in an untrusted program. This task is hard to automate and requires the participation of a cybersecurity expert. Existing solutions are aimed at supporting the analyst's manual work; the automation they provide does not demonstrate a systematic approach. When the necessary analysis tools are absent, the analyst loses proper support and is forced to develop tools on their own, which greatly slows down obtaining practical results. The paper presents a software complex that addresses the problem of revealing malicious behavior as a whole: from creating a controlled execution environment to the human-guided preparation of a high-level description of the analyzed algorithm. A QEMU Developer Toolkit (QDT) is introduced, offering support for the domain-specific development life cycle. QDT is especially suited for QEMU virtual machine development, including specialized testing and debugging technologies and tools. A high-level hierarchical flowchart-based representation of a program algorithm is presented, as well as an algorithm for its construction. The proposed representation is based on a hypergraph and allows both automatic and manual data flow analysis at various levels of detail. The developed representation is suitable for implementing automatic analysis algorithms. An approach to improving the quality of the resulting representation of the algorithm is proposed. The approach combines individual data streams into a single one that links separate logical modules of the algorithm. A test set based on real programs and model examples has been developed to evaluate the result of constructing the proposed high-level algorithm representation.

Keywords: binary code analysis, flowcharts, data flow analysis, controlled execution, domain-specific development environment.

DOI: 10.15514/ISPRAS-2019-31(6)-3


