By J.P.
<aside> đź’ˇ
Feel free to explore at your own pace! If any part gets too technical or isn’t your thing, there are summaries and interesting examples throughout. Or you can just watch the video and come back to the post if anything catches your interest!
</aside>
Picture this: A self-driving car approaches a stop sign, but instead of recognizing it correctly, the car's AI perceives it as a speed limit sign—all because someone placed a few barely noticeable stickers on it. This unsettling scenario illustrates an adversarial attack, where subtle manipulations to input data cause artificial intelligence (AI) systems to make incorrect predictions or classifications.
To us, the sign below looks like a worn-out stop sign, but a computer vision (CV) model is 86% sure it is a 45 mph speed limit sign.


Figure 1: Example of a stop sign altered to fool autonomous driving models (left) vs. a normal stop sign (right).
But it's not just traffic signs at risk. Adversarial attacks can also compromise facial recognition systems. For instance, adding small amounts of carefully crafted "noise" to images of faces can confuse AI models that would otherwise recognize them accurately.
Figure 2: Examples of adversarial noise added to faces, causing misclassification by facial recognition models.2
Adversarial attacks typically involve adding a small amount of calculated adversarial noise—tiny perturbations imperceptible to humans—to an image. Despite being invisible to us, this noise can deceive image classification models into making confident yet incorrect predictions.
Consider this example:
Figure 3: An image demonstrating how adversarial noise causes a model to misclassify a panda as a gibbon.
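To make the idea concrete, here is a minimal sketch of one common way such noise can be generated: the Fast Gradient Sign Method (FGSM), which nudges every pixel slightly in the direction that increases the model's loss. The snippet below assumes PyTorch and uses a toy, randomly initialized classifier purely for illustration; a real attack would target a trained model such as an ImageNet classifier.

```python
import torch
import torch.nn as nn

def fgsm_attack(model, image, label, epsilon=0.01):
    """Generate an adversarial example with the Fast Gradient Sign Method.

    The perturbation is the sign of the loss gradient with respect to the
    input, scaled by a small epsilon so it stays visually imperceptible.
    """
    image = image.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(image), label)
    loss.backward()
    # Step each pixel slightly in the direction that increases the loss.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()

# Toy demo: a randomly initialized linear classifier stands in for a real,
# trained image model. The mechanics of the attack are the same.
if __name__ == "__main__":
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    model.eval()
    image = torch.rand(1, 3, 32, 32)   # stand-in for a real photo
    label = torch.tensor([3])          # the image's true class index
    adv = fgsm_attack(model, image, label, epsilon=0.01)
    print("max pixel change:", (adv - image).abs().max().item())
    print("clean prediction:", model(image).argmax(dim=1).item())
    print("adversarial prediction:", model(adv).argmax(dim=1).item())
```

The key knob is epsilon: the smaller it is, the harder the change is to see, yet even tiny values can flip a model's prediction with high confidence, which is exactly what happens in the panda-to-gibbon example above.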