arXiv robustness
Neural networks are notably susceptible to adversarial attacks. Understanding which features in the training data are most vulnerable, and how to protect them, is therefore an important endeavour. In our recent pre-print, we introduce a synthetic model of structured data that captures this phenomenology, and we provide an exact asymptotic solution of adversarial training in this model. In particular, we identify a trade-off between generalisation and robustness, and propose strategies to defend non-robust features.
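As a rough illustration of the ideas above (not the pre-print's exact model or solution), the sketch below trains a linear classifier with FGSM-style adversarial training on a hypothetical synthetic dataset where the label is carried by a single signal direction; all dimensions, strengths, and hyper-parameters are assumptions chosen for the example. For a linear model, the worst-case L-infinity perturbation of size eps has a closed form, which makes the inner maximisation exact.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic structured data: a Gaussian mixture in which the
# label y is carried by one fixed "signal" direction plus isotropic noise.
n, d, eps, lr = 500, 20, 0.1, 0.1
signal = np.zeros(d)
signal[0] = 2.0                                  # the informative feature
y = rng.choice([-1.0, 1.0], size=n)
X = y[:, None] * signal + rng.normal(size=(n, d))

w = np.zeros(d)
for _ in range(200):
    # For a linear model, the worst-case L-inf attack of budget eps moves
    # each input against its label along sign(w) (exact inner maximisation).
    X_adv = X - eps * y[:, None] * np.sign(w)
    margins = y * (X_adv @ w)
    # Average logistic-loss gradient on the adversarial examples.
    grad = -(y[:, None] * X_adv / (1 + np.exp(margins))[:, None]).mean(axis=0)
    w -= lr * grad

clean_acc = np.mean(np.sign(X @ w) == y)
X_attacked = X - eps * y[:, None] * np.sign(w)
robust_acc = np.mean(np.sign(X_attacked @ w) == y)
print(f"clean accuracy:  {clean_acc:.2f}")
print(f"robust accuracy: {robust_acc:.2f}")
```

Comparing the two accuracies as eps grows is one simple way to see a generalisation vs. robustness trade-off in miniature: a larger attack budget forces the learned weights to concentrate on the robust signal direction at some cost in clean accuracy.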