Fully Convolutional Network (FCN) discussion thread

Could anyone tell me the minimum machine configuration needed to run the FCN example?
My machine has 12 GB of RAM and a GTX 760. I set the batch size to 10 and used only 50 training images and labels,
but I still get: tensor_gpu-inl.cuh:110: Check failed: err == cudaSuccess (2 vs. 0) Name: MapPlanKernel ErrStr:out of memory

Your GPU has very little memory (2 GB)... it may not even fit a batch size of 1.
If you really want to run it, set ctx = mx.cpu() and run on the CPU.

Finally figured it out: after lowering the batch size to 8 I can run it on the GPU :sleepy:

Same problem here. Have you solved it yet?

How long does downloading the pretrained model take for everyone?
pretrained_net = model_zoo.vision.resnet18_v2(pretrained=True)
This has been running for a long time for me and still hasn't finished downloading.

Same problem. Have you solved it yet?

test_images, test_labels = gb.read_voc_images(is_train=False)
The is_train argument of read_voc_images in the code should be changed to train.
MXNet version: mxnet-cu91 1.2.1.post1

Same problem here. Could I ask whether you've solved it by now?

Strictly speaking, a transposed convolution layer initialized with bilinear interpolation is what implements the "deconvolution" operation here. In this tutorial the transposed convolution layer is meant to restore the image, i.e. to recover the pixel values as faithfully as possible while enlarging the height and width.


After watching the Bilibili video I have a question: why do we add a batch dimension? Is taking argmax over axis=1 actually taking the index of the maximum over the RGB values on axis=2? And why do we afterwards drop the batch axis (axis=0) and the channel axis (axis=1)? After that reshape we do end up with a 2-D image, but doesn't that leave only the pixel positions? I'm quite confused by the prediction part.
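As far as I understand, axis=1 here indexes the num_classes score channels produced by the network's final 1x1 convolution, not RGB; the input image's three color channels are long gone by that point. A small NumPy sketch of the prediction step (all shapes are assumed for illustration):

```python
import numpy as np

# Network output: one score per class per pixel. The batch axis exists
# only because the net expects batched input, so a single image has
# shape (batch=1, num_classes, H, W).
scores = np.random.randn(1, 21, 4, 4)

pred = scores.argmax(axis=1)   # best class per pixel -> shape (1, 4, 4)
label_map = pred[0]            # drop the batch axis   -> shape (4, 4)
print(label_map.shape)
```

Note that argmax itself removes the class axis, and indexing with [0] removes the batch axis; what remains is a 2-D map of class ids per pixel, which is then mapped to the VOC color palette for display. That is why both axes "disappear".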

In my view the concept of a "transposed convolution" adds little: with a suitable arrangement, a convolution can be expressed as a matrix multiplication. For image interpolation the final computation is still a convolution; it amounts to convolving the same region several times with different weights and arranging the results in order on the output image. You can also think of it as a stride smaller than one pixel: while the accumulated stride has not yet reached 1, each step switches to a different kernel, and once it exceeds 1 you return to the first kernel. This form of convolution can likewise be written as a matrix multiplication. If anything, the name "transposed convolution" is confusing, since in my opinion it has nothing to do with matrix transposition. To turn a convolution into a matrix multiplication, flatten the input and pad zeros between the kernel entries that correspond to different output rows.
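For reference, that matrix view can be written out directly in 1-D NumPy. One note on the name: the usual account is that the layer's forward pass multiplies by the transpose of the convolution's matrix, so the "transpose" does refer to exactly this matrix form (same connectivity pattern, opposite direction).

```python
import numpy as np

# 1-D convolution (cross-correlation), kernel k, stride 1, no padding,
# written as a matrix product y = W @ x.
k = np.array([1.0, 2.0])
x = np.array([0.0, 1.0, 2.0, 3.0])
n_out = len(x) - len(k) + 1                 # 3 output positions
W = np.zeros((n_out, len(x)))
for i in range(n_out):
    W[i, i:i + len(k)] = k                  # shifted copies of the kernel

y = W @ x                                   # ordinary convolution
assert np.allclose(y, np.convolve(x, k[::-1], mode='valid'))

# The transposed convolution maps the 3-vector back to a 4-vector by
# multiplying with W.T -- this is where the name comes from.
x_up = W.T @ y
print(y, x_up)
```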


I'm getting an error and Google turned up no solution. Please help!
My MXNet version is mxnet-cu90 with CUDA 9.0. MXNet runs fine in general; it's only this step that fails.

(image attachment)
What do the loss, train acc, and test acc here mean? What do they tell us? Could some expert explain? I'm a beginner.

In mainland China you can set the environment variable MXNET_GLUON_REPO to https://apache-mxnet.s3.cn-north-1.amazonaws.com.cn/ to speed up downloading datasets and pretrained model parameters from Gluon.
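For example, set it from Python before the first import of mxnet (or export it in the shell before launching):

```python
import os

# Must be set before `import mxnet`, since Gluon reads it at download time.
os.environ['MXNET_GLUON_REPO'] = (
    'https://apache-mxnet.s3.cn-north-1.amazonaws.com.cn/')
print(os.environ['MXNET_GLUON_REPO'])
```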

Could someone tell me how to resolve this error?

Book, p. 294:
With strides = s, padding = s/2, and kernel_size = 2*s, the transposed convolution layer magnifies the height and width of its input by a factor of s each.
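Plugging those values (kernel_size = 2s, strides = s, padding = s/2) into the transposed-convolution output-size formula confirms this; note that padding = s/2 requires an even s:

```python
# Transposed convolution output size (no output_padding):
#   out = (in - 1) * stride - 2 * padding + kernel_size
def tconv_out(n, s):
    kernel_size, stride, padding = 2 * s, s, s // 2
    return (n - 1) * stride - 2 * padding + kernel_size

# (n-1)*s - s + 2s = n*s, i.e. the input is upsampled by exactly s:
print([tconv_out(10, s) for s in (2, 4, 8)])   # -> [20, 40, 80]
```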

I'd like to ask: what is the difference between the bilinear_kernel function here and the Bilinear initializer in the API (https://mxnet.apache.org/api/python/docs/api/initializer/index.html#mxnet.initializer.Bilinear)? Can the API version be used here?


It does work, but I don't quite understand the difference between the source implementation and the bilinear_kernel here. Can anyone explain?
I'll paste my training results with the two initializations.

Initialized with the bilinear_kernel function:
training on gpu(0)  
epoch 1, loss 1.2221, train acc 0.748, test acc 0.811, time 25.4 sec
epoch 2, loss 0.5155, train acc 0.844, test acc 0.841, time 23.6 sec
epoch 3, loss 0.3808, train acc 0.876, test acc 0.845, time 23.3 sec
epoch 4, loss 0.3273, train acc 0.891, test acc 0.849, time 23.3 sec
epoch 5, loss 0.2993, train acc 0.899, test acc 0.848, time 23.2 sec

Initialized with init.Bilinear(), trained 5 epochs first and then another 15 epochs:
training on gpu(0)
epoch 1, loss 2.2494, train acc 0.724, test acc 0.729, time 28.5 sec
epoch 2, loss 1.3130, train acc 0.726, test acc 0.728, time 23.3 sec
epoch 3, loss 1.1854, train acc 0.727, test acc 0.729, time 22.9 sec
epoch 4, loss 1.1383, train acc 0.729, test acc 0.727, time 22.8 sec
epoch 5, loss 1.1108, train acc 0.728, test acc 0.729, time 22.7 sec
training on gpu(0)
epoch 1, loss 1.0861, train acc 0.728, test acc 0.728, time 23.1 sec
epoch 2, loss 1.0653, train acc 0.728, test acc 0.728, time 22.8 sec
epoch 3, loss 1.0438, train acc 0.727, test acc 0.729, time 22.9 sec
epoch 4, loss 1.0244, train acc 0.728, test acc 0.728, time 22.8 sec
epoch 5, loss 1.0175, train acc 0.728, test acc 0.730, time 23.0 sec
epoch 6, loss 0.9999, train acc 0.730, test acc 0.732, time 22.8 sec
epoch 7, loss 0.9914, train acc 0.730, test acc 0.733, time 22.8 sec
epoch 8, loss 0.9839, train acc 0.732, test acc 0.734, time 22.4 sec
epoch 9, loss 0.9611, train acc 0.735, test acc 0.735, time 22.6 sec
epoch 10, loss 0.9544, train acc 0.737, test acc 0.739, time 22.6 sec
epoch 11, loss 0.9443, train acc 0.740, test acc 0.741, time 22.7 sec
epoch 12, loss 0.9229, train acc 0.748, test acc 0.751, time 22.6 sec
epoch 13, loss 0.9105, train acc 0.756, test acc 0.753, time 22.7 sec
epoch 14, loss 0.8903, train acc 0.761, test acc 0.758, time 22.6 sec
epoch 15, loss 0.8743, train acc 0.766, test acc 0.765, time 22.7 sec

The results with the API initializer never caught up with the ones above. Why is that? I still don't understand the difference between the two implementations.


The prediction results are noticeably worse than with bilinear_kernel initialization... strange.

training on gpu(0)
epoch 1, loss 1.2522, train acc 0.741, test acc 0.811, time 24.2 sec
epoch 2, loss 0.5558, train acc 0.830, test acc 0.836, time 21.3 sec
epoch 3, loss 0.4030, train acc 0.871, test acc 0.835, time 21.6 sec
epoch 4, loss 0.3434, train acc 0.885, test acc 0.851, time 21.2 sec
epoch 5, loss 0.3014, train acc 0.898, test acc 0.841, time 21.4 sec
epoch 6, loss 0.2661, train acc 0.909, test acc 0.847, time 21.4 sec
epoch 7, loss 0.2505, train acc 0.914, test acc 0.854, time 21.8 sec
epoch 8, loss 0.2287, train acc 0.921, test acc 0.856, time 21.5 sec
epoch 9, loss 0.2042, train acc 0.929, test acc 0.857, time 21.9 sec
epoch 10, loss 0.1988, train acc 0.930, test acc 0.857, time 21.7 sec
epoch 11, loss 0.1848, train acc 0.934, test acc 0.858, time 21.2 sec
epoch 12, loss 0.1857, train acc 0.934, test acc 0.859, time 21.5 sec
epoch 13, loss 0.1743, train acc 0.938, test acc 0.861, time 21.8 sec
epoch 14, loss 0.1700, train acc 0.939, test acc 0.858, time 21.8 sec
epoch 15, loss 0.1662, train acc 0.940, test acc 0.859, time 21.4 sec

Re-initializing with bilinear_kernel reaches a decent accuracy again.
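A NumPy transcription of the book's bilinear_kernel (the MXNet version returns an nd.array, but the values are the same) may clarify the difference. Note the weight[range(in_channels), range(out_channels)] indexing: the filter sits only on matching channel pairs, so each class channel is upsampled independently of the others. If I recall correctly, init.Bilinear instead fills every entry of the weight array with the same filter, which mixes all input channels into every output channel; that could explain the worse starting point and slower convergence above, though treat this as an educated guess.

```python
import numpy as np

def bilinear_kernel(in_channels, out_channels, kernel_size):
    """Weight array for a Conv2DTranspose that does bilinear upsampling."""
    factor = (kernel_size + 1) // 2
    center = factor - 1 if kernel_size % 2 == 1 else factor - 0.5
    og = np.ogrid[:kernel_size, :kernel_size]
    # 2-D bilinear filter: outer product of two triangular 1-D filters
    filt = ((1 - abs(og[0] - center) / factor) *
            (1 - abs(og[1] - center) / factor))
    weight = np.zeros((in_channels, out_channels, kernel_size, kernel_size))
    # Place the filter only where input channel == output channel
    weight[range(in_channels), range(out_channels), :, :] = filt
    return weight

w = bilinear_kernel(21, 21, 64)
print(w.shape, w[0, 1].sum())   # off-diagonal channel pairs stay zero
```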

One more question about the exercises. It seems I can't make more than three posts, so I'll add it here.

How should the multi-scale version in the exercise be written? Do we rebuild the network from scratch, like GoogLeNet? That seems tedious, but using pretrained_net directly I don't know how to put the branches in parallel. Headache...

from mxnet.gluon import nn

# num_classes = 21 for Pascal VOC; pretrained_net is the ResNet-18 from earlier
multi_scale_net = nn.HybridSequential()
for layer in pretrained_net.features[:-2]:
    multi_scale_net.add(layer)
    # After each residual stage, branch off a 1x1 score conv plus a
    # transposed conv that upsamples back to the input resolution
    if layer.name.startswith('resnetv20_stage'):
        base = int(layer.name[-1])
        multi_scale_net.add(
            nn.Conv2D(num_classes, kernel_size=1),
            nn.Conv2DTranspose(num_classes, kernel_size=2**(base + 2),
                               padding=2**base, strides=2**(base + 1)))

print(multi_scale_net)

You can see from the printed network that the branches are there, but I don't know how to make them run in parallel; it looks like everything is chained in series. Could someone point me in the right direction? Many thanks!

HybridSequential(
  (0): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=True, use_global_stats=False, in_channels=3)
  (1): Conv2D(3 -> 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (2): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=64)
  (3): Activation(relu)
  (4): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(1, 1), ceil_mode=False, global_pool=False, pool_type=max, layout=NCHW)
  (5): HybridSequential(
    (0): BasicBlockV2(
      (bn1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=64)
      (conv1): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=64)
      (conv2): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    )
    (1): BasicBlockV2(
      (bn1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=64)
      (conv1): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=64)
      (conv2): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    )
  )
  (6): Conv2D(None -> 21, kernel_size=(1, 1), stride=(1, 1))
  (7): Conv2DTranspose(21 -> 0, kernel_size=(8, 8), stride=(4, 4), padding=(2, 2))
  (8): HybridSequential(
    (0): BasicBlockV2(
      (bn1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=64)
      (conv1): Conv2D(64 -> 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn2): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=128)
      (conv2): Conv2D(128 -> 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (downsample): Conv2D(64 -> 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
    )
    (1): BasicBlockV2(
      (bn1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=128)
      (conv1): Conv2D(128 -> 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=128)
      (conv2): Conv2D(128 -> 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    )
  )
  (9): Conv2D(None -> 21, kernel_size=(1, 1), stride=(1, 1))
  (10): Conv2DTranspose(21 -> 0, kernel_size=(16, 16), stride=(8, 8), padding=(4, 4))
  (11): HybridSequential(
    (0): BasicBlockV2(
      (bn1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=128)
      (conv1): Conv2D(128 -> 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn2): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=256)
      (conv2): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (downsample): Conv2D(128 -> 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
    )
    (1): BasicBlockV2(
      (bn1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=256)
      (conv1): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=256)
      (conv2): Conv2D(256 -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    )
  )
  (12): Conv2D(None -> 21, kernel_size=(1, 1), stride=(1, 1))
  (13): Conv2DTranspose(21 -> 0, kernel_size=(32, 32), stride=(16, 16), padding=(8, 8))
  (14): HybridSequential(
    (0): BasicBlockV2(
      (bn1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=256)
      (conv1): Conv2D(256 -> 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn2): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=512)
      (conv2): Conv2D(512 -> 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (downsample): Conv2D(256 -> 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
    )
    (1): BasicBlockV2(
      (bn1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=512)
      (conv1): Conv2D(512 -> 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=512)
      (conv2): Conv2D(512 -> 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    )
  )
  (15): Conv2D(None -> 21, kernel_size=(1, 1), stride=(1, 1))
  (16): Conv2DTranspose(21 -> 0, kernel_size=(64, 64), stride=(32, 32), padding=(16, 16))
  (17): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=512)
  (18): Activation(relu)
)
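For what it's worth, the FCN skip architecture (FCN-16s/8s) does not chain the extra branches the way the loop above does: each branch splits off after its stage, gets its own 1x1 score conv and upsampling, and the resulting class-score maps are fused by element-wise addition. A framework-agnostic NumPy sketch of just the fusion step (all shapes are assumed for illustration):

```python
import numpy as np

num_classes, H, W = 21, 320, 480

# Hypothetical per-stage class-score maps, each already passed through
# its own 1x1 conv and transposed conv so they share the output size.
score_stage3 = np.random.randn(1, num_classes, H, W)
score_stage4 = np.random.randn(1, num_classes, H, W)
score_stage5 = np.random.randn(1, num_classes, H, W)

# The branches run in parallel and are fused by element-wise addition,
# not stacked one after another.
fused = score_stage3 + score_stage4 + score_stage5
print(fused.shape)
```

In Gluon this means writing a HybridBlock whose hybrid_forward computes each branch from the shared trunk and returns the sum, rather than adding everything to a single HybridSequential, since HybridSequential can only express a serial chain.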

Reducing the batch size did help. Thanks!