Table 4 Comparison of recognition accuracy with state-of-the-art methods on HMDB51 and UCF101 datasets

From: Fused behavior recognition model based on attention mechanism

| Methods | Input modality | Pre-training | HMDB51 (%) | UCF101 (%) |
|---|---|---|---|---|
| HOG/HOF [1] | RGB | - | 20.44 | - |
| IDT [17] | RGB | - | 57.2 | 85.9 |
| MIFS [18] | RGB | - | 65.1 | 89.1 |
| ECO-Lite (16 frames) [12] | RGB | Kinetics | 68.2 | 91.6 |
| ECO (16 frames) [12] | RGB | Kinetics | 68.5 | 92.8 |
| ResNext-101 [19] | RGB | Kinetics | 63.8 | 90.7 |
| Res3D [15] | RGB | Sports-1M | 54.9 | 85.8 |
| I3D [9] | RGB | Kinetics | 74.5 | 95.4 |
| ResNet101 [19] | RGB | Kinetics | 61.7 | 88.9 |
| DTTP (split 1) [20] | RGB | ImageNet | 61.5 | 89.7 |
| RSN [21] | RGB | - | 55.9 | 87.5 |
| Two-stream (fusion by SVM) [5] | RGB, Optical flow | ILSVRC | 59.4 | 88.0 |
| VGG16 + TSN [22] | RGB, Optical flow | ImageNet | 67.3 | 92.1 |
| ResNet34-3DRes18 (16 frames) | RGB | Kinetics | 70.997 | 92.143 |
| Res34-SE-IM-Net (16 frames) | RGB | Kinetics | 71.85 | 92.196 |