Abstract:
An important limitation of existing adversarial attacks on real-world object detectors lies in their threat model: adversarial patch-based methods often produce suspicious images, while image generation approaches do not restrict the attacker's ability to modify the original scene. We design a threat model in which the attacker modifies individual image segments and is required to produce realistic images. We also develop and evaluate a white-box attack that uses generative adversarial networks and diffusion models as generators of malicious images. Our attack produces high-fidelity images, as measured by the Fréchet inception distance (FID), and reduces the mAP of a Faster R-CNN model by more than 0.2 on the Cityscapes and COCO-Stuff datasets. A PyTorch implementation of our attack is available at https://github.com/DariaShel/gan-attack.
Key words and phrases: adversarial examples, object detectors, generative adversarial networks, diffusion models.