Table 3 Experimental results of ablation studies on EndoVis-18 [25] and EndoVis-17 [26] datasets

From: Dual modality prompt learning for visual question-grounded answering in robotic surgery

| Models | EndoVis-18 ACC | EndoVis-18 F-score | EndoVis-18 mIoU | EndoVis-17 ACC | EndoVis-17 F-score | EndoVis-17 mIoU |
|---|---|---|---|---|---|---|
| Co-Attn DeiT [24] | 0.6136 | 0.3208 | 0.7273 | 0.3805 | 0.3026 | 0.6870 |
| CAT-ViL DeiT [15] | 0.6452 | 0.3321 | 0.7705 | 0.4491 | 0.3622 | 0.7322 |
| GVLE-LViT [14] | 0.6659 | 0.3614 | 0.7625 | 0.4576 | 0.2489 | 0.7275 |
| TCP (Ours) | 0.6845 | 0.4846 | 0.7762 | 0.4639 | 0.3334 | 0.7509 |
| VCP (Ours) | 0.6581 | 0.4078 | 0.7740 | 0.4915 | 0.3636 | 0.7685 |
| DMPL (Ours) | 0.6953 | 0.5137 | 0.7827 | 0.4957 | 0.3717 | 0.7436 |