Abstract:
Many buffer overrun errors in C programs are caused by erroneous string manipulations. These can lead to denial of service, incorrect computations or even exploitable vulnerabilities. One approach to eliminate such defects in the course of program development is static analysis. Existing static analysis methods for analyzing strings either produce many false positives, miss too many errors, scale poorly, or are implemented as a part of a proprietary software. To cover a significant amount of the real program defects it is necessary to detect errors that could happen only on a particular program path and cannot be defined by a single erroneous point. Also, it is essential to find misusage of library functions and user functions. The aim of this study is to develop a detection algorithm that will cover such cases, will produce at most 40% false warnings, will be applicable to any C programs without any additional restrictions, and will scale up to millions of lines of code. We have extended our approach of symbolic execution with state merging to support string manipulations. Also we have developed a string overflow detector based on our buffer overflow approach with integer indices. The new algorithm was implemented in the Svace static analyzer. As a result, the coverage of the buffer-overflow related testcases from the Juliet test suite has increased from 15.4% to 41.5% with zero false positives. Also we have compared our Juliet results to those of the Infer static analyzer. The basic Svace version (without string support) is on par with Infer except for the flow variant of complex loops, whereas string-related buffer overflows are not detected by Infer.