Deformable Convolution

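Could anyone tell me whether I am using the deformable convolution layer correctly? Every time I replace an existing convolution layer with a deformable convolution, the network fails to converge and learns nothing: the accuracy stays at chance level for the whole run. Any help would be much appreciated!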

import mxnet as mx

# pool3 is the output of the preceding block (not shown in this post)
conv4_1 = mx.symbol.Convolution(
    data=pool3, kernel=(3, 3), pad=(1, 1), num_filter=512, name="conv4_1")
bn4_1 = mx.sym.BatchNorm(data=conv4_1, name="bn4_1", use_global_stats=True, fix_gamma=False)
relu4_1 = mx.symbol.Activation(data=bn4_1, act_type="relu", name="relu4_1")

# Deformable convolution layer

# Offset branch: 2 * 3 * 3 * num_deformable_group = 18 output channels
conv_offset4_2 = mx.symbol.Convolution(name="conv_offset4_2", data=relu4_1,
                                       num_filter=18, pad=(1, 1),
                                       kernel=(3, 3), stride=(1, 1))
dcn4_2 = mx.contrib.sym.DeformableConvolution(data=relu4_1, name="dcn4_2",
                                              offset=conv_offset4_2, pad=(1, 1),
                                              kernel=(3, 3), num_filter=512,
                                              num_deformable_group=1, no_bias=True)
bn4_2 = mx.sym.BatchNorm(data=dcn4_2, name="bn4_2", use_global_stats=True, fix_gamma=False)
relu4_2 = mx.symbol.Activation(data=bn4_2, act_type="relu", name="relu4_2")
# Convolution layer
conv4_3 = mx.symbol.Convolution(
    data=relu4_2, kernel=(3, 3), pad=(1, 1), num_filter=512, name="conv4_3")
bn4_3 = mx.sym.BatchNorm(data=conv4_3, name="bn4_3", use_global_stats=True, fix_gamma=False)
relu4_3 = mx.symbol.Activation(data=bn4_3, act_type="relu", name="relu4_3")
pool4 = mx.symbol.Pooling(
    data=relu4_3, pool_type="max", kernel=(2, 2), stride=(2, 2), name="pool4")
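
A side note for anyone checking a similar setup: two properties of the snippet above can be verified by plain arithmetic. The offset input of `DeformableConvolution` needs `2 * kernel_h * kernel_w * num_deformable_group` channels, and the offset branch should produce the same spatial size as the deformable convolution itself. The sketch below is plain Python, not MXNet API: the helper names `offset_channels` and `conv_out_size` and the 28-pixel feature-map size are made up for illustration. It only checks shapes and does not by itself explain the non-convergence.

```python
def offset_channels(kernel, num_deformable_group=1):
    """Channels the offset branch must output:
    one (dy, dx) pair per kernel tap, per deformable group."""
    kh, kw = kernel
    return 2 * kh * kw * num_deformable_group


def conv_out_size(size, kernel, pad=0, stride=1, dilate=1):
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * pad - dilate * (kernel - 1) - 1) // stride + 1


# Values taken from the post: 3x3 kernel, one deformable group, num_filter=18.
assert offset_channels((3, 3), num_deformable_group=1) == 18

# Assumed input size (hypothetical, the post does not state it).
feature_map_size = 28

# Offset branch and deformable conv both use kernel=3, pad=1, stride=1,
# so their spatial output sizes agree.
offset_size = conv_out_size(feature_map_size, 3, pad=1, stride=1)
dcn_size = conv_out_size(feature_map_size, 3, pad=1, stride=1)
assert offset_size == dcn_size
```

By this check, the channel count and spatial sizes in the post are consistent, so the cause of the training problem probably lies elsewhere.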

Has this been solved? I have run into the same problem: the network does not converge when I use DCN.