Table 4 Adversarial robustness measures for the transformer architecture using no regularization, dropout, and various flipover techniques respectively

From: Flipover outperforms dropout in deep learning

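| Method | Dropout rate | Flipover rate | \(\alpha\) | ACC Org (%) | ACC Att (%) |
| --- | --- | --- | --- | --- | --- |
| Vanilla | - | - | - | 81.63 | 9.50 |
| Dropout | 0.1 | - | - | 82.88 | 9.50 |
| Replace all | - | 0.1 | 0.5 | 80.45 | 11.06 |
| Replace all | - | 0.2 | 0.5 | 73.88 | 10.24 |
| Replace all | - | 0.1 | 0.7 | 78.95 | 7.74 |
| Replace all | - | 0.1 | 0.2 | 80.54 | 8.63 |
| Replace gradual | - | 0.1–0.3 | 0.5 | 73.88 | 10.24 |
| Single | - | 0.3 | 0.5 | 82.26 | 10.24 |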

Note: Replace all means replacing all dropout layers with flipover; Replace gradual means replacing all dropout layers with flipover and gradually increasing the flipover rate for deeper layers; Single means adding a single flipover before the final head.
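
The three placement strategies in the note can be illustrated with a small sketch that only computes which flipover rate would apply at each position in the network. It does not implement flipover itself. The function name `flipover_rates`, the slot layout, and the linear schedule used for Replace gradual are assumptions made for illustration; the table gives only the range 0.1–0.3, not the exact schedule or the number of layers.

```python
from typing import List, Optional


def flipover_rates(strategy: str, num_layers: int, rate: float,
                   max_rate: Optional[float] = None) -> List[Optional[float]]:
    """Return one flipover rate per slot (None means no flipover there).

    Hypothetical layout: slots 0..num_layers-1 are the positions where the
    baseline model has dropout layers; the final slot is the position just
    before the final head.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    if strategy == "replace_all":
        # Same rate at every former dropout position, none before the head.
        return [rate] * num_layers + [None]
    if strategy == "replace_gradual":
        if max_rate is None:
            raise ValueError("replace_gradual needs max_rate")
        if num_layers == 1:
            return [rate, None]
        # ASSUMPTION: linear increase from `rate` (shallowest layer)
        # to `max_rate` (deepest layer).
        step = (max_rate - rate) / (num_layers - 1)
        return [round(rate + i * step, 6) for i in range(num_layers)] + [None]
    if strategy == "single":
        # No flipover at the former dropout positions, one before the head.
        return [None] * num_layers + [rate]
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    # Rates taken from the table rows; num_layers=3 is an arbitrary choice.
    print(flipover_rates("replace_all", 3, 0.1))
    print(flipover_rates("replace_gradual", 3, 0.1, max_rate=0.3))
    print(flipover_rates("single", 3, 0.3))
```

Under these assumptions, the Replace gradual row would correspond to rates that rise layer by layer between 0.1 and 0.3, while the Single row applies a rate of 0.3 at one position only.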