Visual Computing for Industry, Biomedicine, and Art

Table 3 Experimental results of ablation studies on EndoVis-18 [25] and EndoVis-17 [26] datasets

From: Dual modality prompt learning for visual question-grounded answering in robotic surgery

Models	EndoVis-18 ACC	EndoVis-18 F-score	EndoVis-18 mIoU	EndoVis-17 ACC	EndoVis-17 F-score	EndoVis-17 mIoU
Co-Attn DeiT [24]	0.6136	0.3208	0.7273	0.3805	0.3026	0.6870
CAT-ViL DeiT [15]	0.6452	0.3321	0.7705	0.4491	0.3622	0.7322
GVLE-LViT [14]	0.6659	0.3614	0.7625	0.4576	0.2489	0.7275
TCP (Ours)	0.6845	0.4846	0.7762	0.4639	0.3334	0.7509
VCP (Ours)	0.6581	0.4078	0.7740	0.4915	0.3636	0.7685
DMPL (Ours)	0.6953	0.5137	0.7827	0.4957	0.3717	0.7436
