
PyTorch Model Definition

Ways to define a model in PyTorch

The two main approaches are the standard subclassing pattern and the built-in container classes.

1. Subclassing nn.Module

Define a model by subclassing the `Module` base class from the `nn` package and implementing the `__init__()` and `forward()` methods.

```python
import torch.nn as nn

class Model(nn.Module):
    # model initialization
    def __init__(self):
        # call the parent constructor first
        super(Model, self).__init__()
        # define the layers and the weights between them
        self.hidden = nn.Linear(128, 16)  # hidden layer
        self.activate = nn.ReLU()
        self.output = nn.Linear(16, 10)   # output layer

    # the forward computation
    def forward(self, x):
        x = self.activate(self.hidden(x))
        return self.output(x)
```

Using the model

```python
import torch

test_input = torch.rand((256, 128))
net = Model()
print(net(test_input))
print(net)
```

Output:

```
tensor([[ 0.2073, -0.0732,  0.1259,  ...,  0.1715,  0.1540,  0.1089],
        [ 0.2999, -0.0407,  0.1167,  ...,  0.1734,  0.2581,  0.0695],
        [ 0.3761, -0.0145,  0.1217,  ...,  0.1910,  0.2665,  0.0340],
        ...,
        [ 0.3084, -0.1336,  0.1632,  ...,  0.2174, -0.0351,  0.1205],
        [ 0.3323, -0.1693,  0.1411,  ...,  0.1700,  0.1171, -0.0442],
        [ 0.3395,  0.0367, -0.0068,  ...,  0.2706,  0.2510,  0.0017]],
       grad_fn=<AddmmBackward>)
Model(
  (hidden): Linear(in_features=128, out_features=16, bias=True)
  (activate): ReLU()
  (output): Linear(in_features=16, out_features=10, bias=True)
)
```
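As a quick sanity check, the output shape follows the last layer: a (256, 128) input produces a (256, 10) output, and calling `net(x)` invokes `forward` through `nn.Module.__call__`. A minimal, self-contained sketch (repeating the `Model` class from above):

```python
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.hidden = nn.Linear(128, 16)
        self.activate = nn.ReLU()
        self.output = nn.Linear(16, 10)

    def forward(self, x):
        x = self.activate(self.hidden(x))
        return self.output(x)

net = Model()
# net(x) calls forward via nn.Module.__call__ (plus hooks), so forward
# itself is not usually called directly
out = net(torch.rand((256, 128)))
print(out.shape)  # torch.Size([256, 10])
```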

2. Quick model definition with containers

PyTorch provides a number of container classes that inherit from nn.Module, such as Sequential, for defining models quickly.

1. nn.Sequential

A sequential container. Modules will be added to it in the order they are passed in the constructor. Alternatively, an ordered dict of modules can also be passed in.

Pass a sequence of layers or other modules (or an OrderedDict of named modules) to Sequential; the forward pass runs them in the order they were given, which defines the model.

```python
import torch.nn as nn
from collections import OrderedDict

# Example of using Sequential
model = nn.Sequential(
    nn.Conv2d(1, 20, 5),
    nn.ReLU(),
    nn.Conv2d(20, 64, 5),
    nn.ReLU()
)

# Example of using Sequential with OrderedDict
model = nn.Sequential(OrderedDict([
    ('conv1', nn.Conv2d(1, 20, 5)),
    ('relu1', nn.ReLU()),
    ('conv2', nn.Conv2d(20, 64, 5)),
    ('relu2', nn.ReLU())
]))
print(model)
```

Output:

```
Sequential(
  (conv1): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
  (relu1): ReLU()
  (conv2): Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
  (relu2): ReLU()
)
```
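Unlike the bare containers below, Sequential has its own forward(): it feeds the input through each child in order, and children can be accessed by index. A small runnable sketch (linear layers chosen here just for illustration):

```python
import torch
import torch.nn as nn

# equivalent to output(relu(hidden(x))): Sequential runs children in order
model = nn.Sequential(
    nn.Linear(128, 16),
    nn.ReLU(),
    nn.Linear(16, 10)
)
x = torch.rand((4, 128))
y = model(x)
print(y.shape)   # torch.Size([4, 10])
print(model[0])  # children can be indexed: Linear(in_features=128, out_features=16, bias=True)
```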

2. nn.ModuleList

ModuleList can be indexed like a regular Python list, but modules it contains are properly registered, and will be visible by all Module methods.

A list that stores modules; it supports the list operations append and insert, and its elements can be accessed by index.

  • Unlike Sequential, a ModuleList is just a list: it has no forward() of its own.
  • Unlike a plain Python list, the parameters of every module stored in a ModuleList are registered with the parent module and tracked for gradient computation.
```python
class MyModule(nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        self.linears = nn.ModuleList([nn.Linear(10, 10) for i in range(10)])

    def forward(self, x):
        # ModuleList can act as an iterable, or be indexed using ints
        for i, l in enumerate(self.linears):
            x = self.linears[i // 2](x) + l(x)
        return x

net = MyModule()
print(net)
```

Output:

```
MyModule(
  (linears): ModuleList(
    (0): Linear(in_features=10, out_features=10, bias=True)
    (1): Linear(in_features=10, out_features=10, bias=True)
    (2): Linear(in_features=10, out_features=10, bias=True)
    (3): Linear(in_features=10, out_features=10, bias=True)
    (4): Linear(in_features=10, out_features=10, bias=True)
    (5): Linear(in_features=10, out_features=10, bias=True)
    (6): Linear(in_features=10, out_features=10, bias=True)
    (7): Linear(in_features=10, out_features=10, bias=True)
    (8): Linear(in_features=10, out_features=10, bias=True)
    (9): Linear(in_features=10, out_features=10, bias=True)
  )
)
```
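The list operations mentioned above (append, insert, indexing) work just as on a regular list, and every module added this way has its parameters registered automatically. A minimal sketch:

```python
import torch.nn as nn

layers = nn.ModuleList([nn.Linear(10, 10)])
layers.append(nn.Linear(10, 5))      # append like a regular list
layers.insert(0, nn.Linear(20, 10))  # insert at the front
print(len(layers))                   # 3
print(layers[0])                     # Linear(in_features=20, out_features=10, bias=True)
# parameters of all three layers are registered:
# (20*10+10) + (10*10+10) + (10*5+5) = 375
print(sum(p.numel() for p in layers.parameters()))  # 375
```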

Comparison with storing modules in a plain Python list

```python
class MyModule1(nn.Module):
    def __init__(self):
        super(MyModule1, self).__init__()
        # plain Python list: the layers are NOT registered
        self.linearList = [nn.Linear(i, i) for i in range(1, 3)]

class MyModule2(nn.Module):
    def __init__(self):
        super(MyModule2, self).__init__()
        # nn.ModuleList: the layers ARE registered
        self.linearList = nn.ModuleList([nn.Linear(i, i) for i in range(1, 3)])

net1 = MyModule1()
net2 = MyModule2()
print('net1:')
for parameter in net1.parameters():
    print(parameter)
print('net2:')
for parameter in net2.parameters():
    print(parameter)
```

Output:

```
net1:
net2:
Parameter containing:
tensor([[0.9031]], requires_grad=True)
Parameter containing:
tensor([0.8702], requires_grad=True)
Parameter containing:
tensor([[ 0.4465, -0.2073],
        [-0.3850, -0.6185]], requires_grad=True)
Parameter containing:
tensor([-0.1432, -0.6002], requires_grad=True)
```

3. nn.ModuleDict

Similar to ModuleList, except the modules are stored in a dict and accessed by key.
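A minimal sketch of nn.ModuleDict with the same registration behavior but dict-style access (the key names here are just illustrative):

```python
import torch
import torch.nn as nn

class MyDictModule(nn.Module):
    def __init__(self):
        super(MyDictModule, self).__init__()
        self.layers = nn.ModuleDict({
            'hidden': nn.Linear(128, 16),
            'act': nn.ReLU(),
            'output': nn.Linear(16, 10),
        })

    def forward(self, x):
        # ModuleDict is indexed by key; like ModuleList it has no forward of its own
        x = self.layers['act'](self.layers['hidden'](x))
        return self.layers['output'](x)

net = MyDictModule()
out = net(torch.rand((4, 128)))
print(out.shape)  # torch.Size([4, 10])
```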