Table 2. Evaluations of different models on the PSI-AVA [38] dataset

From: Dual modality prompt learning for visual question-grounded answering in robotic surgery

                       PSI-AVA
Models                 ACC       Precision   Recall    F-score
VisualBERT [29]        0.3008    0.1818      0.6970    0.1408
VisualBERT R [30]      0.2901    0.1927      0.6137    0.1673
CAT-ViL DeiT [15]      0.2806    0.1819      0.5967    0.1668
DeepSeek-VL [39]       0.1342    0.6342      0.3756    0.0572
ALLaVA [40]            0.1727    0.5829      0.2626    0.1177
DMPL (Ours)            0.3222    0.1552      0.7976    0.1036