Visual Computing for Industry, Biomedicine, and Art

Table 2 Evaluations of different models on the PSI-AVA [38] dataset

From: Dual modality prompt learning for visual question-grounded answering in robotic surgery

Models	ACC	Precision	Recall	F-score
VisualBERT [29]	0.3008	0.1818	0.6970	0.1408
VisualBERT R [30]	0.2901	0.1927	0.6137	0.1673
CAT-ViL DeiT [15]	0.2806	0.1819	0.5967	0.1668
DeepSeek-VL [39]	0.1342	0.6342	0.3756	0.0572
ALLaVA [40]	0.1727	0.5829	0.2626	0.1177
DMPL (Ours)	0.3222	0.1552	0.7976	0.1036
