Deep Convolutional Neural Networks (AlexNet) Discussion


#103


Looks like I need to buy a new machine.
I checked my wallet... Folks, is a 2060 graphics card enough?


#104

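I ran the AlexNet chapter and, to my surprise, it reported insufficient memory. My GPU is an NVIDIA 1080 Ti. How much memory does it take to run this?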


#105

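I ran into a similar problem. It turned out there is a patch for CUDA 9.2, and after applying it the message no longer appeared. My laptop's graphics card also has only 2 GB, so I had to reduce the batch size. You could try that as well.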
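
A minimal sketch of the "reduce the batch size" suggestion, assuming the training step raises an exception when memory runs out. The names `fit_batch_size`, `try_step` and `fake_step` are hypothetical and not part of d2l or MXNet; with a real framework you would pass its own out-of-memory exception type in `oom_errors`.

```python
def fit_batch_size(try_step, batch_size, min_batch_size=1, oom_errors=(MemoryError,)):
    """Halve batch_size until try_step(batch_size) no longer raises an OOM error."""
    while batch_size >= min_batch_size:
        try:
            try_step(batch_size)      # run one trial step at this batch size
            return batch_size         # it fit, so keep this value
        except oom_errors:
            batch_size //= 2          # did not fit, retry with half the batch
    raise RuntimeError("even the smallest batch size does not fit in memory")


def fake_step(batch_size, limit=32):
    """Stand-in for a real training step: pretends batches above `limit` run out of memory."""
    if batch_size > limit:
        raise MemoryError(f"batch of {batch_size} does not fit")
```

Calling `fit_batch_size(fake_step, 128)` tries 128, 64 and then 32, and returns the first size that fits. A smaller batch mainly lowers activation memory, so it may also call for a smaller learning rate.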


#106

With resize=None I get an error, and I don't know why.

(mxnet_p36) ubuntu@ip-10-0-0-0:~/d2l-zh/chapter_convolutional-neural-networks$ python 1.py 
training on gpu(0)
infer_shape error. Arguments:
  data: (128, 1, 28, 28)
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/gluon/block.py", line 803, in _call_cached_op
    for is_arg, i in self._cached_op_args]
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/gluon/block.py", line 803, in <listcomp>
    for is_arg, i in self._cached_op_args]
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/gluon/parameter.py", line 494, in data
    return self._check_and_get(self._data, ctx)
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/gluon/parameter.py", line 208, in _check_and_get
    "num_features, etc., for network layers."%(self.name))
mxnet.gluon.parameter.DeferredInitializationError: Parameter 'conv0_weight' has not been initialized yet because initialization was deferred. Actual initialization happens during the first forward pass. Please pass one batch of data through the network before accessing Parameters. You can also avoid deferred initialization by specifying in_units, num_features, etc., for network layers.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/gluon/block.py", line 789, in _deferred_infer_shape
    self.infer_shape(*args)
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/gluon/block.py", line 862, in infer_shape
    self._infer_attrs('infer_shape', 'shape', *args)
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/gluon/block.py", line 851, in _infer_attrs
    **{i.name: getattr(j, attr) for i, j in zip(inputs, args)})
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/symbol/symbol.py", line 996, in infer_shape
    res = self._infer_shape_impl(False, *args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/symbol/symbol.py", line 1126, in _infer_shape_impl
    ctypes.byref(complete)))
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/base.py", line 252, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: Error in operator pool1_fwd: [14:40:49] src/operator/nn/pooling.cc:155: Check failed: param.kernel[0] <= dshape[2] + 2 * param.pad[0] kernel size (3) exceeds input (2 padded to 2)

Stack trace returned 10 entries:
[bt] (0) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x3da6c2) [0x7fdac0c2b6c2]
[bt] (1) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x3dac98) [0x7fdac0c2bc98]
[bt] (2) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x9bebec) [0x7fdac120fbec]
[bt] (3) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x3195e3f) [0x7fdac39e6e3f]
[bt] (4) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x31989b8) [0x7fdac39e99b8]
[bt] (5) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libmxnet.so(MXSymbolInferShape+0x1549) [0x7fdac39535d9]
[bt] (6) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/lib-dynload/../../libffi.so.6(ffi_call_unix64+0x4c) [0x7fdaf8bbeec0]
[bt] (7) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/lib-dynload/../../libffi.so.6(ffi_call+0x22d) [0x7fdaf8bbe87d]
[bt] (8) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(_ctypes_callproc+0x2ce) [0x7fdaf8dd3e2e]
[bt] (9) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(+0x12865) [0x7fdaf8dd4865]



During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "1.py", line 59, in <module>
    d2l.train_ch5(net, train_iter, test_iter, batch_size, trainer, ctx, num_epochs)
  File "../d2lzh/utils.py", line 687, in train_ch5
    y_hat = net(X)
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/gluon/block.py", line 540, in __call__
    out = self.forward(*args)
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/gluon/block.py", line 907, in forward
    return self._call_cached_op(x, *args)
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/gluon/block.py", line 805, in _call_cached_op
    self._deferred_infer_shape(*args)
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/gluon/block.py", line 793, in _deferred_infer_shape
    raise ValueError(error_msg)
ValueError: Deferred initialization failed because shape cannot be inferred. Error in operator pool1_fwd: [14:40:49] src/operator/nn/pooling.cc:155: Check failed: param.kernel[0] <= dshape[2] + 2 * param.pad[0] kernel size (3) exceeds input (2 padded to 2)

Stack trace returned 10 entries:
[bt] (0) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x3da6c2) [0x7fdac0c2b6c2]
[bt] (1) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x3dac98) [0x7fdac0c2bc98]
[bt] (2) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x9bebec) [0x7fdac120fbec]
[bt] (3) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x3195e3f) [0x7fdac39e6e3f]
[bt] (4) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x31989b8) [0x7fdac39e99b8]
[bt] (5) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libmxnet.so(MXSymbolInferShape+0x1549) [0x7fdac39535d9]
[bt] (6) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/lib-dynload/../../libffi.so.6(ffi_call_unix64+0x4c) [0x7fdaf8bbeec0]
[bt] (7) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/lib-dynload/../../libffi.so.6(ffi_call+0x22d) [0x7fdaf8bbe87d]
[bt] (8) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(_ctypes_callproc+0x2ce) [0x7fdaf8dd3e2e]
[bt] (9) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(+0x12865) [0x7fdaf8dd4865]
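
The `kernel size (3) exceeds input (2 padded to 2)` check in the traceback above can be reproduced with plain arithmetic. The sketch below assumes the kernel, stride and padding settings of the chapter's AlexNet; the layer table and the helper names `out_size` and `trace_shapes` are illustrative, not part of MXNet. With a 28x28 input (the `data: (128, 1, 28, 28)` shape in the log), the feature map has already shrunk to 2x2 by the time it reaches the second pooling layer.

```python
def out_size(n, kernel, stride=1, pad=0):
    """Spatial output size of a conv/pool layer, using floor division."""
    if kernel > n + 2 * pad:
        raise ValueError(
            f"kernel size ({kernel}) exceeds input ({n} padded to {n + 2 * pad})")
    return (n + 2 * pad - kernel) // stride + 1


# (name, kernel, stride, pad): assumed to mirror the chapter's AlexNet layers
ALEXNET_LAYERS = [
    ("conv0", 11, 4, 0),
    ("pool0", 3, 2, 0),
    ("conv1", 5, 1, 2),
    ("pool1", 3, 2, 0),
    ("conv2", 3, 1, 1),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("pool2", 3, 2, 0),
]


def trace_shapes(n):
    """Return (layer, size) pairs for a square n x n input; stop at the first failure."""
    sizes = []
    for name, kernel, stride, pad in ALEXNET_LAYERS:
        try:
            n = out_size(n, kernel, stride, pad)
        except ValueError as err:
            sizes.append((name, str(err)))  # record where shape inference breaks
            break
        sizes.append((name, n))
    return sizes


if __name__ == "__main__":
    for size in (28, 224):
        print(size, trace_shapes(size))
```

Under these assumptions `trace_shapes(28)` stops at `pool1` with the same message as the log, while `trace_shapes(224)` runs through every layer and ends with a 5x5 map after `pool2`. That would explain why passing a larger `resize` value avoids the error.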

#107

Even my 4 GB of GPU memory is enough... It may be that Jupyter is not releasing memory. Try restarting Jupyter?


#108

Has this problem been solved?