Foreword
This post is based on the official PyTorch tutorial. I edited it into a Jupyter notebook to my own taste and then converted it into this blog post, to serve as my day-to-day reference material.
PyTorch
My impression of PyTorch is that its higher-level abstractions make deep learning simpler to implement, which suits research work and beginners who want to try out ideas. TensorFlow is better suited to engineering projects and can run more efficiently in production. For most users, though, PyTorch is the better fit: it builds dynamic graphs, and when one's coding skill is not yet high, working in PyTorch tends to be more productive than working in TensorFlow.
from __future__ import print_function
import torch
Tensors
# Construct a 2x1 matrix, uninitialized:
x = torch.empty(2, 1)
print(x)
tensor([[0.0000],
[0.0000]])
# Construct a randomly initialized matrix:
x = torch.rand(5, 3)
print(x)
tensor([[0.2854, 0.5359, 0.7811],
[0.1065, 0.0246, 0.3945],
[0.8341, 0.6808, 0.4578],
[0.4257, 0.7255, 0.3597],
[0.3510, 0.3170, 0.1526]])
# Construct a matrix filled with zeros, of dtype long:
x = torch.zeros(2, 1, dtype=torch.long)
print(x)
tensor([[0],
[0]])
# Construct a tensor directly from data:
x = torch.tensor([3, 3])
print(x)
tensor([3, 3])
x = x.new_ones(2, 1, dtype=torch.double)  # new_* methods take in sizes
print(x)
x = torch.randn_like(x, dtype=torch.float)  # override dtype; result has the same size
print(x)
tensor([[1.],
[1.]], dtype=torch.float64)
tensor([[1.4535],
[0.0968]])
# get its size
print(x.size())
print(x.shape)  # equivalent
torch.Size([2, 1])
torch.Size([2, 1])
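As the official tutorial notes, torch.Size is in fact a tuple, so it supports all tuple operations. A quick sketch:

```python
import torch

x = torch.empty(2, 1)
h, w = x.size()            # torch.Size supports tuple unpacking
assert (h, w) == (2, 1)
assert x.size() == (2, 1)  # it also compares equal to a plain tuple
```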
Operations
# Addition: several equivalent syntaxes
x = torch.rand(3, 2)
y = torch.rand(3, 2)
print(x + y)
print(torch.add(x, y))
result = torch.empty(3, 2)
torch.add(x, y, out=result)  # providing an output tensor as argument
print(result)
y.add_(x)  # in-place addition: mutates y
print(y)
tensor([[1.9447, 1.9085],
[1.3177, 1.8074],
[1.1208, 1.8663]])
tensor([[1.9447, 1.9085],
[1.3177, 1.8074],
[1.1208, 1.8663]])
tensor([[1.9447, 1.9085],
[1.3177, 1.8074],
[1.1208, 1.8663]])
tensor([[1.9447, 1.9085],
[1.3177, 1.8074],
[1.1208, 1.8663]])
Note: any operation with a trailing underscore mutates its tensor in place, e.g. x.copy_(y), x.t_().
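To make the in-place convention concrete, a minimal sketch:

```python
import torch

x = torch.ones(2, 2)
y = torch.ones(2, 2)

z = x.add(y)     # out-of-place: x is unchanged, result goes into z
assert (x == 1).all()
assert (z == 2).all()

x.add_(y)        # in-place: x itself is modified
assert (x == 2).all()

x.t_()           # in-place transpose (takes no argument)
assert x.shape == (2, 2)
```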
x = torch.rand(3, 2)
print(x)
print(x[:, 1])  # NumPy-style indexing: the second column
tensor([[0.8368, 0.9204],
[0.3797, 0.0908],
[0.4454, 0.7684]])
tensor([0.9204, 0.0908, 0.7684])
# Resizing: torch.view reshapes a tensor
x = torch.randn(4, 4)
y = x.view(16)
z = x.view(-1, 8)  # the size -1 is inferred from the other dimensions
print(x.size(), y.size(), z.size())
torch.Size([4, 4]) torch.Size([16]) torch.Size([2, 8])
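One point worth knowing about view: it returns a new tensor that shares the same underlying storage, so writes through the view are visible in the original tensor. A small sketch:

```python
import torch

x = torch.randn(4, 4)
y = x.view(16)
z = x.view(-1, 8)        # -1 is inferred from the remaining dimensions
assert z.shape == (2, 8)

y[0] = 100.0             # the view shares storage with x
assert x[0, 0].item() == 100.0
```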
# Converting a NumPy array to a Torch tensor
import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
np.add(a, 1, out=a)  # a and b share memory, so both change
print(a)
print(b)
[2. 2. 2. 2. 2.]
tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
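The bridge works in both directions, and in both cases the CPU tensor and the NumPy array share the same memory:

```python
import numpy as np
import torch

a = np.ones(5)
b = torch.from_numpy(a)   # NumPy -> Torch, shared memory
np.add(a, 1, out=a)
assert b[0].item() == 2.0

t = torch.ones(3)
n = t.numpy()             # Torch -> NumPy, also shared memory
t.add_(1)
assert n[0] == 2.0
```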
# CUDA tensors
if torch.cuda.is_available():
    device = torch.device("cuda")
    y = torch.ones_like(x, device=device)  # create a tensor directly on the GPU
    x = x.to(device)
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))  # .to can also change dtype
Define the Network
import torch
import torch.nn as nn
import torch.nn.functional as F
This network was originally LeNet, whose input must be 32x32; I changed the first fully connected layer so that the input is 20x20 instead.
So the architecture can be adapted to your own needs; the key is adjusting the fully connected layers accordingly.
The output size after a convolution is given by: $$ \text{outputsize} = (\text{inputsize} - \text{kernelsize} + 2 \times \text{pad}) / \text{stride} + 1 $$
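Plugging the 20x20 input into this formula step by step (5x5 kernels, no padding, stride 1, with a 2x2 max pool after each convolution) confirms the first fully connected layer's input size of 64:

```python
def conv_out(size, kernel, pad=0, stride=1):
    # output_size = (input_size - kernel_size + 2*pad) / stride + 1
    return (size - kernel + 2 * pad) // stride + 1

s = conv_out(20, 5)  # conv1: 20 -> 16
s //= 2              # 2x2 max pool: 16 -> 8
s = conv_out(s, 5)   # conv2: 8 -> 4
s //= 2              # 2x2 max pool: 4 -> 2
assert 16 * s * s == 64  # 16 channels * 2 * 2 = fc1's in_features
```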
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)        # 1 input channel, 6 output channels, 5x5 kernel
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 2 * 2, 120)  # a 20x20 input shrinks to 16 feature maps of 2x2
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, 16 * 2 * 2)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))  # final relu, matching grad_fn=<ReluBackward> in the output below
        return x

net = Net()
print(net)
Net(
(conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
(conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
(fc1): Linear(in_features=64, out_features=120, bias=True)
(fc2): Linear(in_features=120, out_features=84, bias=True)
(fc3): Linear(in_features=84, out_features=10, bias=True)
)
# The learnable parameters of a model are returned by net.parameters()
params = list(net.parameters())
print(len(params))
print(params[0].size())  # conv1's weights
10
torch.Size([6, 1, 5, 5])
input = torch.randn(1, 1, 20, 20)
out = net(input)
print(out)
tensor([[0.0432, 0.1072, 0.0000, 0.1096, 0.0378, 0.0000, 0.0000, 0.0000, 0.0202,
0.0000]], grad_fn=<ReluBackward>)
Zero the gradients
net.zero_grad()  # zero the gradient buffers of all parameters
out.backward(torch.randn(1, 10))  # backprop with random gradients
torch.nn only supports mini-batches. The entire torch.nn package only supports inputs that are a mini-batch of samples, and not a single sample.
For example, nn.Conv2d will take in a 4D Tensor of nSamples x nChannels x Height x Width.
If you have a single sample, just use input.unsqueeze(0) to add a fake batch dimension.
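A quick sketch of the fake batch dimension:

```python
import torch

sample = torch.randn(1, 20, 20)   # a single sample: channels x height x width
batch = sample.unsqueeze(0)       # add a batch dimension at position 0
assert batch.shape == (1, 1, 20, 20)
```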
Loss Function
out = net(input)
target = torch.randn(10)     # a dummy target, for example
target = target.view(1, -1)  # make it the same shape as the output
criterion = nn.MSELoss()
loss = criterion(out, target)
print(loss)
tensor(1.7536, grad_fn=<MseLossBackward>)
forward:
input -> conv2d -> relu -> maxpool2d -> conv2d -> relu -> maxpool2d
      -> view -> linear -> relu -> linear -> relu -> linear -> relu
      -> MSELoss
      -> loss
print(loss.grad_fn)  # MSELoss
print(loss.grad_fn.next_functions[0][0])  # ReLU
print(loss.grad_fn.next_functions[0][0].next_functions[0][0])  # Linear
<MseLossBackward object at 0x0000000008619780>
<ReluBackward object at 0x0000000008619A58>
<ThAddmmBackward object at 0x0000000008619780>
Backpropagate
net.zero_grad()  # zeros the gradient buffers of all parameters
print('conv1.bias.grad before backward')
print(net.conv1.bias.grad)
loss.backward()
print('conv1.bias.grad after backward')
print(net.conv1.bias.grad)
conv1.bias.grad before backward
tensor([0., 0., 0., 0., 0., 0.])
conv1.bias.grad after backward
tensor([0., 0., 0., 0., 0., 0.])
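The reason the buffers must be zeroed: PyTorch accumulates gradients across backward calls. A minimal sketch with a toy tensor (not the network above):

```python
import torch

w = torch.ones(2, requires_grad=True)
(w * 3).sum().backward()
assert torch.equal(w.grad, torch.tensor([3., 3.]))

(w * 3).sum().backward()   # gradients accumulate across calls
assert torch.equal(w.grad, torch.tensor([6., 6.]))

w.grad.zero_()             # what net.zero_grad() does for every parameter
assert torch.equal(w.grad, torch.tensor([0., 0.]))
```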
## Weights: a manual SGD update
learning_rate = 0.01
for f in net.parameters():
    f.data.sub_(f.grad.data * learning_rate)
import torch.optim as optim

# create an optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)

# in the training loop:
optimizer.zero_grad()  # zero the gradient buffers
out = net(input)
loss = criterion(out, target)
loss.backward()
optimizer.step()  # does the update
Training A Classifier
import torch
import torchvision
import torchvision.transforms as transforms
# The output of torchvision datasets are PILImage images of range [0, 1].
# We transform them to Tensors of normalized range [-1, 1].
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')
Files already downloaded and verified
Files already downloaded and verified
import matplotlib.pyplot as plt
import numpy as np

def imshow(img):
    img = img / 2 + 0.5  # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()

# show some random training images with their labels
dataiter = iter(trainloader)
images, labels = next(dataiter)
imshow(torchvision.utils.make_grid(images))
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))

deer truck plane horse
Define a Convolutional Neural Network
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)  # 32x32 input shrinks to 16 maps of 5x5
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
print(net)
Net(
(conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
(pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
(fc1): Linear(in_features=400, out_features=120, bias=True)
(fc2): Linear(in_features=120, out_features=84, bias=True)
(fc3): Linear(in_features=84, out_features=10, bias=True)
)
Define a Loss function and optimizer
import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
Train the network
for epoch in range(2):  # loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data        # get the inputs
        optimizer.zero_grad()        # zero the parameter gradients

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:  # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')
[1, 2000] loss: 2.217
[1, 4000] loss: 1.855
[1, 6000] loss: 1.672
[1, 8000] loss: 1.570
[1, 10000] loss: 1.506
[1, 12000] loss: 1.461
[2, 2000] loss: 1.370
[2, 4000] loss: 1.369
[2, 6000] loss: 1.340
[2, 8000] loss: 1.323
[2, 10000] loss: 1.276
[2, 12000] loss: 1.279
Finished Training
dataiter = iter(testloader)
images, labels = next(dataiter)

# show some test images with their ground-truth labels
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))

GroundTruth: cat ship ship plane
outputs = net(images)
print(outputs)
tensor([[-0.7807, -1.7329, 0.9516, 1.8043, -1.2077, 0.7349, 1.3615, -1.5412,
1.0311, -1.7234],
[ 5.2413, 6.1315, -1.7400, -3.2287, -5.0497, -5.8038, -4.2375, -4.4465,
7.7529, 4.1842],
[ 2.7686, 3.8909, -0.7050, -1.7185, -3.2625, -3.2960, -2.2888, -2.6324,
3.8761, 2.4697],
[ 4.0929, 1.8480, 0.1962, -1.7907, -1.6931, -3.4538, -2.2210, -3.0694,
4.5059, 0.7960]], grad_fn=<ThAddmmBackward>)
_, predicted = torch.max(outputs, 1)  # index of the highest score per image
print(predicted)
print('Predicted: ', ' '.join('%5s' % classes[predicted[j]] for j in range(4)))
tensor([3, 8, 1, 8])
Predicted: cat ship car ship
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' %
      (100 * correct / total))
Accuracy of the network on the 10000 test images: 54 %
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1

for i in range(10):
    print('Accuracy of %5s : %2d %%' %
          (classes[i], 100 * class_correct[i] / class_total[i]))
Accuracy of plane : 55 %
Accuracy of car : 59 %
Accuracy of bird : 65 %
Accuracy of cat : 29 %
Accuracy of deer : 22 %
Accuracy of dog : 49 %
Accuracy of frog : 71 %
Accuracy of horse : 55 %
Accuracy of ship : 75 %
Accuracy of truck : 62 %