
Bypassing ten adversarial detection methods (Carlini and Wagner, 2017)
20 Jan 2020 · 1030 words

Some rough notes on this paper.

Paper link: Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods (Carlini and Wagner, 2017)

In brief: Adversarial defences are often flimsy. The authors bypass ten detection methods for adversarial examples, using both black-box and white-box attacks; the C&W attack is the main attack used. The most promising defence evaluated the classification uncertainty of each image by generating randomised models.
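The randomised-model uncertainty idea can be sketched roughly as follows: run several stochastic (dropout-style) forward passes per input and use the spread of the predictions as an uncertainty score, rejecting inputs that score too high. The toy linear model and all names here are my own illustrative assumptions, not the paper's code.

```python
import numpy as np

# Sketch of the randomised-model (dropout) uncertainty defence.
# The toy linear "network" W is an assumption for illustration.

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 3))  # toy logit layer: 10 features -> 3 classes

def predict_with_dropout(x, p=0.5):
    """One stochastic forward pass: drop input features with probability p."""
    mask = rng.random(x.shape) >= p
    return (x * mask / (1 - p)) @ W

def uncertainty(x, n_samples=50):
    """Score an input by the spread of its stochastic predictions:
    here, the mean variance of the logits across samples."""
    preds = np.stack([predict_with_dropout(x) for _ in range(n_samples)])
    return float(preds.var(axis=0).mean())

x = rng.normal(size=10)
u = uncertainty(x)
# adversarial examples tended to receive higher uncertainty than natural
# inputs; a threshold on u then decides whether to reject the input
print(u)
```

In the paper's evaluation, this kind of detector raised the cost of a successful attack more than the others, though it was still bypassable with larger distortion.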


The authors consider three attack scenarios, each defined by the adversary's knowledge of the defence: zero-knowledge (the attacker does not know a detector is present), perfect-knowledge (white-box access to the detector), and limited-knowledge (the attacker knows a detector is deployed but not its parameters).


They used one main method of attack: the L2-based C&W attack.
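The core of the L2 C&W attack can be sketched as minimising a weighted sum of the L2 distortion and a misclassification term, with a tanh change of variables to keep the image in the valid box. The toy linear classifier, the constant c, and the crude finite-difference descent (the paper optimises with Adam on a real network) are all illustrative assumptions here.

```python
import numpy as np

# Minimal sketch of the L2-based C&W attack on a toy linear classifier.
# The "network" Z and the optimiser are assumptions for illustration.

rng = np.random.default_rng(1)
W = rng.normal(size=(5, 3))  # toy logits: 5 "pixels" -> 3 classes

def Z(x):
    return x @ W             # stands in for the network's logit layer

def cw_loss(w, x, target, c=1.0, kappa=0.0):
    x_adv = (np.tanh(w) + 1) / 2             # change of variables keeps x_adv in [0, 1]
    logits = Z(x_adv)
    other = np.max(np.delete(logits, target))
    f = max(other - logits[target], -kappa)  # f <= -kappa once the attack succeeds
    return np.sum((x_adv - x) ** 2) + c * f  # L2 distortion + c * attack term

x = rng.random(5)                            # a "natural" input in [0, 1]
target = int(np.argmin(Z(x)))                # attack toward the least-likely class
w = np.arctanh(np.clip(2 * x - 1, -0.999, 0.999))
loss_start = cw_loss(w, x, target)

for _ in range(300):                         # crude finite-difference descent
    g = np.zeros_like(w)
    for i in range(w.size):
        e = np.zeros_like(w)
        e[i] = 1e-5
        g[i] = (cw_loss(w + e, x, target) - cw_loss(w - e, x, target)) / 2e-5
    w -= 0.05 * g

x_adv = (np.tanh(w) + 1) / 2
loss_end = cw_loss(w, x, target)
```

To attack a detector, the same loss is extended so that the adversarial example simultaneously fools the classifier and scores as "natural" to the detector.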


Ten different detectors were tested.

Three of these detectors added a second network for detection. Three relied on PCA to detect adversarial examples. Two used other statistical methods, comparing the distribution of natural images with the distribution of adversarial examples. Two relied on input normalisation with randomisation and blurring.
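The PCA-style detectors rest on the observation that (on MNIST) adversarial examples put unusually much energy into the low-variance principal components of natural data. A minimal sketch, with toy data and a threshold calibrated on natural inputs, both my own assumptions:

```python
import numpy as np

# Sketch of a PCA-based detector: flag inputs with too much energy
# in the lowest-variance principal components of the natural data.
# The toy data and the 95th-percentile threshold are assumptions.

rng = np.random.default_rng(2)

# "natural" data concentrated in the first few directions
natural = rng.normal(size=(500, 20)) * np.linspace(3.0, 0.1, 20)

mean = natural.mean(axis=0)
_, _, Vt = np.linalg.svd(natural - mean, full_matrices=False)

def tail_energy(x, k=10):
    """Energy of x in the k lowest-variance principal components."""
    coeffs = (x - mean) @ Vt.T
    return float(np.sum(coeffs[-k:] ** 2))

# calibrate the rejection threshold on natural data
threshold = np.percentile([tail_energy(x) for x in natural], 95)

def is_adversarial(x):
    return tail_energy(x) > threshold

# a perturbation along the lowest-variance direction is flagged
perturbed = natural[0] + 5.0 * Vt[-1]
print(is_adversarial(perturbed))
```

The paper shows this property is largely an artifact of MNIST (border pixels that are almost always zero), and that an attacker who restricts the perturbation to the high-variance components bypasses the detector.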



Further reading

The four papers the authors recommend for background reading (in order)
