
dc.contributor.author	Wang, Z
dc.contributor.author	Ruan, W
dc.date.accessioned	2022-07-06T09:33:11Z
dc.date.issued	2023-03-17
dc.date.updated	2022-07-05T15:39:01Z
dc.description.abstract	Recent research on the robustness of deep learning has shown that Vision Transformers (ViTs) surpass Convolutional Neural Networks (CNNs) under some perturbations, e.g., natural corruption and adversarial attacks. Some papers argue that the superior robustness of ViTs comes from the segmentation of their input images; others say that Multi-head Self-Attention (MSA) is the key to preserving robustness. In this paper, we introduce a principled and unified theoretical framework to investigate these arguments about ViT robustness. We first prove that, unlike Transformers in Natural Language Processing, ViTs are Lipschitz continuous. We then analyze the adversarial robustness of ViTs from the perspective of the Cauchy problem, through which we can quantify how robustness propagates through the layers. We show that the first and last layers are the critical factors affecting the robustness of ViTs. Furthermore, based on our theory, we empirically show that, contrary to the claims of existing research, MSA only contributes to the adversarial robustness of ViTs under weak adversarial attacks, e.g., FGSM, and surprisingly, MSA actually compromises the model's adversarial robustness under stronger attacks, e.g., PGD.	en_GB
dc.description.sponsorship	Engineering and Physical Sciences Research Council (EPSRC)	en_GB
dc.identifier.citation	In: Machine Learning and Knowledge Discovery in Databases: ECML PKDD 2022, edited by Massih-Reza Amini, Stéphane Canu, Asja Fischer, Tias Guns, Petra Kralj Novak, and Grigorios Tsoumakas, pp. 562–577. Lecture Notes in Computer Science, vol. 13715	en_GB
dc.identifier.doi	10.1007/978-3-031-26409-2_34
dc.identifier.grantnumber	EP/R026173/1	en_GB
dc.identifier.uri	http://hdl.handle.net/10871/130167
dc.language.iso	en	en_GB
dc.publisher	Springer	en_GB
dc.rights.embargoreason	Under embargo until 17 March 2024 in compliance with publisher policy	en_GB
dc.rights	© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
dc.subject	Adversarial Robustness	en_GB
dc.subject	Cauchy Problem	en_GB
dc.subject	Vision Transformer	en_GB
dc.title	Understanding Adversarial Robustness of Vision Transformers via Cauchy Problem	en_GB
dc.type	Conference paper	en_GB
dc.date.available	2022-07-06T09:33:11Z
exeter.location	Grenoble, France
dc.description	This is the author accepted manuscript. The final version is available from Springer via the DOI in this record.	en_GB
dc.description	ECML PKDD 2022: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Grenoble, France, 19–23 September 2022
dc.rights.uri	http://www.rioxx.net/licenses/all-rights-reserved	en_GB
dcterms.dateAccepted	2022-06-14
rioxxterms.version	AM	en_GB
rioxxterms.licenseref.startdate	2022-06-14
rioxxterms.type	Conference Paper/Proceeding/Abstract	en_GB
refterms.dateFCD	2022-07-05T15:39:03Z
refterms.versionFCD	AM
refterms.dateFOA	2024-03-17T00:00:00Z
refterms.panel	B	en_GB
pubs.name-of-conference	European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2022
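The abstract above contrasts the weak single-step FGSM attack with the stronger iterative PGD attack. As an illustrative aside only, here is a minimal NumPy sketch of the two attack schemes; a toy quadratic loss with an analytic gradient stands in for a ViT, and all names (`W`, `t`, `fgsm`, `pgd`) are hypothetical, not from the paper.

```python
import numpy as np

# Toy stand-in for a model loss: L(x) = ||W x - t||^2, with an analytic
# gradient so the attack logic is runnable without a real network.
W = np.array([[2.0, -1.0], [0.5, 1.5]])
t = np.array([1.0, -1.0])

def loss(x):
    return float(np.sum((W @ x - t) ** 2))

def grad(x):
    # d/dx ||W x - t||^2 = 2 W^T (W x - t)
    return 2.0 * W.T @ (W @ x - t)

def fgsm(x, eps):
    # Weak, single-step attack: one move of size eps along sign(gradient).
    return x + eps * np.sign(grad(x))

def pgd(x, eps, alpha=0.05, steps=10):
    # Stronger iterative attack: repeated signed steps, each projected back
    # into the L-infinity ball of radius eps around the clean input.
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad(x_adv))
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv

x0 = np.array([0.3, 0.7])
print("clean loss:", loss(x0))
print("FGSM loss: ", loss(fgsm(x0, 0.1)))
print("PGD loss:  ", loss(pgd(x0, 0.1)))
```

Both attacks stay within the same ε-budget; PGD simply spends it over several projected steps, which is why it is generally at least as strong as FGSM.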

