Robustness in Modern Models and High-Stakes Settings

Adversarial Robustness · Part 5 of 5

Robustness becomes more complicated once we leave the standard image-classification setting. Different model families fail for different reasons, and high-stakes applications care about more than top-1 accuracy. They care about calibration, detection quality, deployment shifts, asymmetric error costs, and multi-sensor consistency.

This is the point where robustness stops looking like a narrow adversarial example problem and starts looking like a systems problem.

Modern Architectures Change the Shape of Fragility

Robustness is not architecture-agnostic. Vision Transformers, prompt-tuned models, self-supervised encoders, spiking networks, and prototype-based systems all introduce different inductive biases and therefore different failure modes.

For ViTs, one of the most important lessons is that the standard training recipe does not transfer cleanly into adversarial settings. Bai et al., 2023 show that removing strong augmentation can improve adversarial robustness when training ViTs.

Robustness in ViTs requires recipe changes. The best standard recipe is not automatically the best robust recipe. Image source: (Bai et al., 2023)

Prompt tuning creates a different issue. Fu et al., 2023 argue that naive adversarial training in prompt-tuned systems can produce gradient obfuscation, creating an illusion of safety rather than real robustness. The lesson is that robust training must adapt to the actual parameterization of the model.
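One way to make the "illusion of safety" concrete is to compare robust accuracy across attacks of different strength. The sketch below encodes three commonly used warning signs of obfuscated gradients. It is not the procedure from Fu et al.; the function name, the attack labels, the 0.05 threshold, and all numbers are hypothetical and chosen only for illustration.

```python
def obfuscation_warnings(robust_acc, unbounded_tolerance=0.05):
    """Return warning strings for suspicious robust-accuracy patterns.

    `robust_acc` maps a (hypothetical) attack name to accuracy under that
    attack, in [0, 1]. Lower accuracy means the attack was stronger.
    `unbounded_tolerance` is an assumed cutoff, not a standard value.
    """
    warnings = []
    # An iterative attack should be at least as strong as a single-step one.
    if robust_acc["pgd_multi_step"] > robust_acc["fgsm_single_step"]:
        warnings.append("single-step attack beats iterative attack")
    # A white-box attack has more information than a black-box one.
    if robust_acc["pgd_multi_step"] > robust_acc["black_box"]:
        warnings.append("black-box attack beats white-box attack")
    # With an unbounded perturbation budget, accuracy should fall to about zero.
    if robust_acc["pgd_unbounded"] > unbounded_tolerance:
        warnings.append("unbounded attack does not drive accuracy to zero")
    return warnings


# Made-up numbers for illustration, not results from any paper.
suspicious = {"fgsm_single_step": 0.30, "pgd_multi_step": 0.45,
              "black_box": 0.25, "pgd_unbounded": 0.20}
healthy = {"fgsm_single_step": 0.50, "pgd_multi_step": 0.40,
           "black_box": 0.55, "pgd_unbounded": 0.0}

print(obfuscation_warnings(suspicious))
print(obfuscation_warnings(healthy))
```

An empty list does not prove the gradients are sound. It only means these particular checks found nothing unusual.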

Self-supervised pre-training changes the picture again. Gao et al., 2023 show that reconstructing clean images from doubly corrupted inputs can push an encoder toward smoother, more robust features.

MIMIR uses corruption and reconstruction to shape robust features. Robustness is influenced by pre-training objectives, not only by the final fine-tuning stage. Image source: (Gao et al., 2023)
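The data flow behind "doubly corrupted" pre-training can be sketched in a few lines. This is a toy illustration, not the MIMIR implementation: it assumes the two corruptions are a bounded perturbation followed by patch masking, uses random sign noise as a stand-in for a real adversarial attack, replaces the encoder-decoder with an identity placeholder, and computes the loss over all pixels. All function names and parameter values are hypothetical.

```python
import random


def perturb(image, eps, rng):
    """Stand-in for an adversarial attack: add +eps or -eps to each pixel,
    then clip to [0, 1]. A real pipeline would use a gradient-based attack."""
    return [min(1.0, max(0.0, x + rng.choice((-eps, eps)))) for x in image]


def mask_patches(image, patch_size, mask_ratio, rng):
    """Zero out a random subset of contiguous patches.
    Returns the masked image and the set of masked patch indices."""
    num_patches = len(image) // patch_size
    num_masked = int(num_patches * mask_ratio)
    masked_ids = set(rng.sample(range(num_patches), num_masked))
    masked = [0.0 if (i // patch_size) in masked_ids else x
              for i, x in enumerate(image)]
    return masked, masked_ids


def reconstruction_loss(reconstruction, clean):
    """Mean squared error against the clean image, not the corrupted input."""
    return sum((r - c) ** 2 for r, c in zip(reconstruction, clean)) / len(clean)


rng = random.Random(0)
clean = [rng.random() for _ in range(64)]        # toy "image": 16 patches of 4 pixels
adv = perturb(clean, eps=8 / 255, rng=rng)       # first corruption
doubly_corrupted, masked_ids = mask_patches(     # second corruption
    adv, patch_size=4, mask_ratio=0.75, rng=rng)

# Placeholder "model": identity. A trained encoder-decoder would lower this
# loss only by undoing both the perturbation and the masking.
loss = reconstruction_loss(doubly_corrupted, clean)
print(len(masked_ids), loss)
```

The point of the sketch is the target: the loss compares the output with the clean image, so the encoder is rewarded for features that ignore the perturbation rather than reproduce it.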

Taken together, these examples show that “modern models” do not add one new robustness problem. They add many. Each architecture changes the interface between optimization, representation, and attack surface.

High-Stakes Domains Care About More Than Classification

Now consider autonomous driving, medical imaging, or any other domain where errors have very different costs. Here, robustness is not just about whether an image classifier flips its label. It is about whether a perception-and-decision stack remains stable under distribution shifts, multi-modal corruption, and task-specific failure modes.

In 3D autonomous driving perception, architecture choice already affects robustness. Zhang et al., 2023 show that voxel-based detectors are often more robust than point-based ones under point-cloud attacks.

Robustness depends on representation in 3D detection. Architecture is already part of the defense story. Image source: (Zhang et al., 2023)

For multi-sensor fusion systems, the problem is broader. Mao et al., 2023 propose robustness certification for semantic transformations such as rotation and translation in camera-LiDAR fusion.

COMMIT moves robustness toward system-level guarantees. The target is no longer one classifier, but a fused perception pipeline. Image source: (Mao et al., 2023)

Medical settings expose another axis: domain shift. A model trained in one hospital may see a noticeably different distribution in another. Weng et al., 2023 show that robustness can generalize across domains more than we might expect, but that generalization cannot be assumed.

Hospital-to-hospital shift changes the robustness question. In medical imaging, distribution shift is often as important as explicit attack design. Image source: (Weng et al., 2023)

Cost Matters, Not Just Error Rate

In many deployments, not all mistakes are equal. Misclassifying a malignant tumor as benign is not the same kind of error as the reverse. That motivates cost-sensitive robustness, where the defense is shaped around the most dangerous failures rather than around a flat misclassification rate.

A cost-sensitive certified radius protects against the most dangerous errors first. Safety-critical robustness must respect asymmetric consequences. Image source: (Horváth et al., 2023)
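A small worked example shows why a flat error rate can hide the asymmetry. The sketch below covers evaluation only, not a certified radius, and it is not taken from the cited work. The cost values, the two models, and their predictions are all hypothetical.

```python
# Hypothetical cost matrix: COST[true_label][predicted_label].
# The 10:1 ratio is an illustrative assumption, not a clinical figure.
COST = {
    "malignant": {"malignant": 0.0, "benign": 10.0},  # missed tumor: expensive
    "benign": {"malignant": 1.0, "benign": 0.0},      # false alarm: cheaper
}


def error_rate(pairs):
    """Fraction of (true, predicted) pairs that disagree."""
    return sum(t != p for t, p in pairs) / len(pairs)


def mean_cost(pairs, cost=COST):
    """Average cost per example under the given cost matrix."""
    return sum(cost[t][p] for t, p in pairs) / len(pairs)


# Two hypothetical models evaluated on the same 10 perturbed inputs
# (5 malignant, 5 benign). Each makes exactly 2 mistakes.
truth = ["malignant"] * 5 + ["benign"] * 5
model_a = ["benign", "benign", "malignant", "malignant", "malignant"] + ["benign"] * 5
model_b = ["malignant"] * 5 + ["malignant", "malignant", "benign", "benign", "benign"]

pairs_a = list(zip(truth, model_a))
pairs_b = list(zip(truth, model_b))

print("A:", error_rate(pairs_a), mean_cost(pairs_a))  # A: 0.2 2.0
print("B:", error_rate(pairs_b), mean_cost(pairs_b))  # B: 0.2 0.2
```

Both models have the same error rate under perturbation, yet model A is ten times more costly because its mistakes fall on the malignant cases. A cost-sensitive defense would aim to protect those cases first.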

This is a useful closing lesson for the entire series. Robustness is not just a property of a model. It is also a property of the environment, the task, and the cost structure around errors.

Where the Field Is Heading

Modern robustness research is moving toward a broader agenda:

robustness that respects architecture-specific behavior,
robustness that survives domain shift and multi-modal deployment,
and robustness that prioritizes the failures that matter most in practice.

That is why I think the right mental model is no longer “Can this classifier survive PGD?” The better question is: What does reliability require in this system, under this deployment, with these costs?

References

[1] Bai, Y., Ding, M., Wang, Y., Zhang, Z.-M., Wang, J., & Tao, D. (2023). A Light Recipe to Train Robust Vision Transformers. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45, Article 10. https://doi.org/10.1109/TPAMI.2023.3283256

[2] Fu, Z., Yuan, X., Li, Y., Guo, Y., Wang, Y., & Zhang, Y. (2023). ADAPT to Robustify Prompt Tuning Vision Transformers. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 16084–16093. https://doi.org/10.1109/CVPR52729.2023.01548

[3] Gao, P., Wang, J., Liu, T., Yan, S., & Wang, B. (2023). MIMIR: Masked Image Modeling for Mutual Information-based Adversarial Robustness. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 24908–24918. https://doi.org/10.1109/CVPR52729.2023.02390

[4] Zhang, C.-H., Zhang, Z., Wu, S., Jiang, T.-Y., & Liu, S. (2023). A Comprehensive Study of the Robustness for LiDAR-based 3D Object Detectors against Adversarial Attacks. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 21919–21929.

[5] Mao, C., Liu, C., Yang, R., Yang, H., Singh, G., & Liu, X. (2023). COMMIT: Certifying Robustness of Multi-Sensor Fusion Systems against Semantic Attacks. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 21789–21798.

[6] Weng, T., Chiang, P., Wang, S., Zhang, H., & Hsieh, C. (2023). Generalizability of Adversarial Robustness Under Distribution Shifts. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 24604–24613. https://doi.org/10.1109/CVPR52729.2023.02361
