Abstract:
The paper addresses the task of static (without execution) comparison of binary executable files. A program and any of its procedures can be represented as a directed graph. For a program, the corresponding graph is a function (procedure) call graph, where the nodes are the functions themselves, and an edge from vertex a to b describes a call to function b from function a. For a procedure, such a graph is a control flow graph, where the vertices are basic blocks, and an edge between nodes a and b means that the commands of block b can be executed after the commands of block a. The study proposes an algorithm for comparing directed graphs, which is then applied to program comparison. The graph comparison algorithm is based on applying a node similarity function. For comparing procedure graphs, a fuzzy hash function and a cryptographic hash function are used as this similarity function. Subsequently, this method of comparing procedure graphs is used as the node similarity function for comparing program graphs. Based on the proposed algorithm, a method for program comparison has been developed, and its investigation was conducted through two experiments. In the first experiment, the method’s behavior was studied when comparing programs compiled with different optimization options (O0, O1, O2, O3, and Os). In the second experiment, the possibility of identifying effective and resilient obfuscating transformations within a previously developed model was investigated. Within the first experiment, evidence was obtained supporting the hypothesis that similarity decreases as optimization increases from O1 to O3. Within the second experiment, some previously obtained results concerning the effectiveness (ineffectiveness) and resilience (non-resilience) of obfuscating transformations were confirmed.
Keywords:graph comparison, program similarity, effectiveness and resilience of obfuscation.