Model Construction (discussion thread)


#1

http://zh.diveintodeeplearning.org/chapter_deep-learning-computation/model-construction.html


#2 (pinned)

#3

Exercise: if RecMLP is changed to self.denses = [nn.Dense(256), nn.Dense(128), nn.Dense(64)], will each layer still get a name and have its parameters initialized? Maybe this doesn't form a network at all, with everything lumped into one level? And when forward is implemented with a for loop, will the machine not know which layer to nest into after finishing the current one, or when to exit, and end up in an infinite loop?


#4

Are the two self.dense calls inside forward the same fully connected layer? Or does initialization automatically generate two self.dense layers?


#5

They are the same layer.


#6

It won't loop forever, but other problems will come up. Try it yourself and see.


#7

Thanks! How can I inspect how the gradient is computed? Since forward calls the same layer twice, how do I look at the gradient of each call separately?
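One way to look at it (a minimal sketch under the Gluon API of that time; SharedMLP and the shapes are made up for illustration): the two calls share a single Dense block, so there is only one set of parameters, and after backward() each parameter's .grad() holds the sum of the gradients contributed by both calls. They cannot be read out per call unless the model uses two separate layers.

from mxnet import nd, autograd
from mxnet.gluon import nn

class SharedMLP(nn.Block):
    def __init__(self, **kwargs):
        super(SharedMLP, self).__init__(**kwargs)
        with self.name_scope():
            self.dense = nn.Dense(20)    # used twice in forward

    def forward(self, x):
        x = nd.relu(self.dense(x))
        return nd.relu(self.dense(x))    # same parameters as the first call

net = SharedMLP()
net.initialize()
x = nd.random_uniform(shape=(4, 20))
with autograd.record():
    y = net(x)
y.backward()
for name, param in net.collect_params().items():
    # the gradient is accumulated over both uses of the shared layer
    print(name, param.grad().sum())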


#8

I tried using a for loop as the forward pass; code below:

class ZcrMLP(nn.Block):
    '''test for loop'''
    def __init__(self, **kwargs):
        super(ZcrMLP, self).__init__(**kwargs)
        with self.name_scope():
            self.denses = [nn.Dense(256), nn.Dense(128), nn.Dense(64)]
        
    def forward(self, x):
        for layer in self.denses:
            x = nd.relu(layer(x))
        return x

zcr_net = ZcrMLP()
zcr_net.initialize()
print(zcr_net)            # prints an empty ZcrMLP (no children registered)
y = zcr_net.forward(x)    # x is an NDArray of shape (4, 20)
# y = zcr_net(x)          # does not work either
# print(y.shape)

This raises a runtime error:

RuntimeError: Parameter zcrmlp6_dense0_weight has not been initialized. Note that you should initialize parameters and create Trainer with Block.collect_params() instead of Block.params because the later does not include Parameters of nested child Blocks

Full log:

ZcrMLP(

)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-29-37fca0d9c4e9> in <module>()
     14 zcr_net.initialize()
     15 print(zcr_net)            # print nothing
---> 16 y = zcr_net.forward(x)    # x is a ndarray in shape (4, 20)
     17 y = zcr_net(x)
     18 # print(y.shape)

<ipython-input-29-37fca0d9c4e9> in forward(self, x)
      8     def forward(self, x):
      9         for layer in self.denses:
---> 10             x = nd.relu(layer(x))
     11         return x
     12 

~\Miniconda3\envs\gluon\lib\site-packages\mxnet\gluon\block.py in __call__(self, *args)
    285     def __call__(self, *args):
    286         """Calls forward. Only accepts positional arguments."""
--> 287         return self.forward(*args)
    288 
    289     def forward(self, *args):

~\Miniconda3\envs\gluon\lib\site-packages\mxnet\gluon\block.py in forward(self, x, *args)
    421                     return self._call_cached_op(x, *args)
    422                 try:
--> 423                     params = {i: j.data(ctx) for i, j in self._reg_params.items()}
    424                 except DeferredInitializationError:
    425                     self.infer_shape(x, *args)

~\Miniconda3\envs\gluon\lib\site-packages\mxnet\gluon\block.py in <dictcomp>(.0)
    421                     return self._call_cached_op(x, *args)
    422                 try:
--> 423                     params = {i: j.data(ctx) for i, j in self._reg_params.items()}
    424                 except DeferredInitializationError:
    425                     self.infer_shape(x, *args)

~\Miniconda3\envs\gluon\lib\site-packages\mxnet\gluon\parameter.py in data(self, ctx)
    342         NDArray on ctx
    343         """
--> 344         return self._check_and_get(self._data, ctx)
    345 
    346     def list_data(self):

~\Miniconda3\envs\gluon\lib\site-packages\mxnet\gluon\parameter.py in _check_and_get(self, arr_dict, ctx)
    158             "with Block.collect_params() instead of Block.params " \
    159             "because the later does not include Parameters of " \
--> 160             "nested child Blocks"%(self.name))
    161 
    162     def _load_init(self, data, ctx):

RuntimeError: Parameter zcrmlp9_dense0_weight has not been initialized. Note that you should initialize parameters and create Trainer with Block.collect_params() instead of Block.params because the later does not include Parameters of nested child Blocks

#9

self.denses = [nn.Dense(256), nn.Dense(128), nn.Dense(64)]
With this approach, the printed RecMLP block comes out empty; the layers are not recognized.

It seems each layer must be assigned to its own instance attribute to be registered correctly:

class RecMLP(nn.Block):
    def __init__(self, **kwargs):
        super(RecMLP, self).__init__(**kwargs)
        #self.net = nn.Sequential()
        with self.name_scope():
            #self.net.add(nn.Dense(256, activation="relu"))
            #self.net.add(nn.Dense(128, activation="relu"))
            self.dense0 = nn.Dense(256)
            self.dense1 = nn.Dense(128)
            self.dense2 = nn.Dense(64)
            self.denses = [self.dense0, self.dense1, self.dense2]

    def forward(self, x):
        for d in self.denses:
            x = d(x)
        return x

rec_mlp = nn.Sequential()
rec_mlp.add(RecMLP())
rec_mlp.add(nn.Dense(10))
print(rec_mlp)
rec_mlp.initialize()
rec_mlp(x)


#10

Is this a hierarchy issue? In the original version, Dense(256) and Dense(128) sit under net while Dense(64) is at the same level as net; with the for-loop implementation all three end up at the same level.

Example:

(net): Sequential(
  (0): Dense(256, Activation(relu))
  (1): Dense(128, Activation(relu))
)
(dense): Dense(64, linear)

With the for loop:

(0): RecMLP(
  (net): Sequential(
    (0): Dense(256, Activation(relu))
    (1): Dense(128, Activation(relu))
    (2): Dense(64, linear)
  )
)


#11

Reading the source reveals the cause: [nn.Dense(256), nn.Dense(128), nn.Dense(64)] has type list, not Block, so it is never automatically registered in the Block's self._children attribute. When initialize() runs, it finds no layers in self._children and therefore cannot initialize any parameters.

  • When self.xxx = yyy executes, __setattr__ checks whether yyy is a Block type; if so, it is added to the self._children list.
  • When initialize() executes, it looks for layers in self._children.

For details, see the __setattr__ and initialize methods of the Block class in the source:
https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/block.py
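Following that explanation, a minimal workaround sketch (ListMLP is a made-up name; this assumes the Gluon API of that era): register each list element by hand with register_child(), the same call Sequential.add() uses internally, so that initialize() can find the layers.

from mxnet import nd
from mxnet.gluon import nn

class ListMLP(nn.Block):
    def __init__(self, **kwargs):
        super(ListMLP, self).__init__(**kwargs)
        with self.name_scope():
            self.denses = [nn.Dense(256), nn.Dense(128), nn.Dense(64)]
            for layer in self.denses:
                # a plain list is invisible to __setattr__, so register
                # each child Block explicitly
                self.register_child(layer)

    def forward(self, x):
        for layer in self.denses:
            x = nd.relu(layer(x))
        return x

net = ListMLP()
net.initialize()
y = net(nd.random_uniform(shape=(4, 20)))
print(y.shape)    # (4, 64)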


#12

Thanks! :rofl:


#13

You can print the network; it contains only one fully connected layer.


#14

I ran into this problem too. Can anyone explain it?


#15

The explanation from root is quite thorough.


#16
class FancyMLP(nn.Block):
    def __init__(self, **kwargs):
        super(FancyMLP, self).__init__(**kwargs)
        with self.name_scope():
            self.dense = nn.Dense(256)
            self.weight = nd.random_uniform(shape=(256, 20))

    def forward(self, x):
        x = nd.relu(self.dense(x))
        x = nd.relu(nd.dot(x, self.weight) + 1)
        x = nd.relu(self.dense(x))
        return x

Why does forward add 1 after the matrix multiplication in x = nd.relu(nd.dot(x, self.weight) + 1)?


#17

That is generally a bias term b. This example only demonstrates how to build a model; it doesn't care about the model details.
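If one did want a real learnable bias instead of the hard-coded +1, a sketch of how it could be registered (FancyMLPWithBias is a made-up name; this assumes the self.params.get API of that Gluon version):

from mxnet import nd
from mxnet.gluon import nn

class FancyMLPWithBias(nn.Block):
    def __init__(self, **kwargs):
        super(FancyMLPWithBias, self).__init__(**kwargs)
        with self.name_scope():
            self.dense = nn.Dense(256)
            # fixed, non-trainable weight, as in FancyMLP above
            self.weight = nd.random_uniform(shape=(256, 20))
            # a trainable bias registered through self.params, so
            # initialize() and Trainer will manage it
            self.bias = self.params.get('bias', shape=(20,))

    def forward(self, x):
        x = nd.relu(self.dense(x))
        x = nd.relu(nd.dot(x, self.weight) + self.bias.data())
        return nd.relu(self.dense(x))

net = FancyMLPWithBias()
net.initialize()
print(net(nd.random_uniform(shape=(4, 20))).shape)    # (4, 256)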


#18

class RecMLP(nn.Block):
    def __init__(self, **kwargs):
        super(RecMLP, self).__init__(**kwargs)
        self.net = nn.Sequential()
        with self.name_scope():
            self.net.add(nn.Dense(256, activation="relu"))
            self.net.add(nn.Dense(128, activation="relu"))
            self.dense0 = nn.Dense(256)
            self.dense1 = nn.Dense(128)
            self.dense2 = nn.Dense(64)
            self.denses = [self.dense0, self.dense1, self.dense2]
            # self.denses = [nn.Dense(256), nn.Dense(128), nn.Dense(64)]

    def forward(self, x):
        x = self.net(x)
        for layer in self.denses:
            x = nd.relu(layer(x))
        return x

rec_mlp = nn.Sequential()
rec_mlp.add(RecMLP())
rec_mlp.add(nn.Dense(10))
print(rec_mlp)

The printed network structure:
Sequential(
  (0): RecMLP(
    (dense1): Dense(128, linear)
    (dense0): Dense(256, linear)
    (dense2): Dense(64, linear)
    (net): Sequential(
      (0): Dense(256, Activation(relu))
      (1): Dense(128, Activation(relu))
    )
  )
  (1): Dense(10, linear)
)

You can see that self.dense0, self.dense1, and self.dense2, which by definition order should come after the Sequential, are printed before it. Their order is not sequential either: self.dense0 ends up as the second entry.


#19

Inside the Sequential class, each layer is registered explicitly, which is what guarantees the ordering. All the layers are kept in a single list:

def add(self, *blocks):
    """Adds block on top of the stack."""
    for block in blocks:
        self.register_child(block)
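A plausible explanation for the scrambled print order in #18, based on the Gluon source of that era (worth verifying against the exact version): a plain Block's __repr__ collects children by iterating the instance's __dict__, and on Python versions before 3.6 a dict preserves no insertion order, so attribute-registered children can print in arbitrary order. Sequential overrides __repr__ to walk self._children in registration order, which is why the nested (net) block prints its layers in sequence while RecMLP's direct attributes come out shuffled.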


#20

I defined a network this way and got an error. I don't understand why; any help appreciated.

encoder = Sequential()
with encoder.name_scope():
    encoder.add(Dense(50, activation='relu'))
    encoder.add(Dense(25, activation='relu'))
    encoder.add(Dense(5, activation='relu'))


decoder = Sequential()
with decoder.name_scope():
    decoder.add(Dense(25, activation='relu'))
    decoder.add(Dense(50, activation='relu'))
    decoder.add(Dense(40))


auto_encoder = Sequential()
with auto_encoder.name_scope():
    auto_encoder.add(encoder())
    auto_encoder.add(decoder())

auto_encoder =auto_encoder()
auto_encoder.initialize()

The error message is:

Traceback (most recent call last):
  File "W:/Nutstore_Lab/11-28/Codes/test2.py", line 47, in <module>
    auto_encoder.add(encoder())
  File "S:\Python36\lib\site-packages\mxnet\gluon\block.py", line 290, in __call__
    return self.forward(*args)
TypeError: forward() missing 1 required positional argument: 'x'

What is going wrong here?
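The likely cause, reading the traceback (a guess, not a reply from the thread): encoder() and decoder() invoke Block.__call__, which calls forward() with no input, hence the missing positional argument 'x'. Sequential.add() expects the Block object itself, and the assembled network should not be called without data either. A minimal corrected sketch:

from mxnet import nd
from mxnet.gluon.nn import Sequential, Dense

encoder = Sequential()
with encoder.name_scope():
    encoder.add(Dense(50, activation='relu'))
    encoder.add(Dense(25, activation='relu'))
    encoder.add(Dense(5, activation='relu'))

decoder = Sequential()
with decoder.name_scope():
    decoder.add(Dense(25, activation='relu'))
    decoder.add(Dense(50, activation='relu'))
    decoder.add(Dense(40))

auto_encoder = Sequential()
with auto_encoder.name_scope():
    auto_encoder.add(encoder)    # pass the Block itself; do not call it
    auto_encoder.add(decoder)

auto_encoder.initialize()        # no `auto_encoder = auto_encoder()` line
out = auto_encoder(nd.random_uniform(shape=(4, 40)))
print(out.shape)                 # (4, 40)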