The meaning of the gradient argument to tensor.backward in PyTorch

Let's say your output is not a scalar (a tensor with a single element). In addition, assume there is some function f acting on the output, so that you get something like newOutput = f(output).

You can pass a gradient grad to output.backward(grad). The idea is that if you are doing backpropagation manually, and you already know the gradient with respect to the input of the next stage (f in this case), you can pass that gradient into backward on the previous stage, the one that produced output, to continue backpropagation from there.
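Here is a minimal sketch of this equivalence (the shapes and the choice of f are made up for illustration): passing the gradient of f with respect to output into output.backward gives the same x.grad as backpropagating from the scalar result of f directly.

```python
import torch

x = torch.randn(3, requires_grad=True)
output = x * 2                       # some stage producing a non-scalar tensor
new_output = (output ** 2).sum()     # f applied to output, yielding a scalar

# Standard way: backprop from the scalar all the way down.
new_output.backward()
print(x.grad)                        # d(new_output)/dx = 8 * x

# Equivalent manual way: compute d(new_output)/d(output) ourselves
# and feed it to output.backward as the gradient argument.
x.grad = None                        # reset the accumulated gradient
output = x * 2
grad = (2 * output).detach()         # d((output**2).sum())/d(output) = 2 * output
output.backward(grad)
print(x.grad)                        # same result: 8 * x
```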

When calling backward manually, this output may be just one layer of the whole network, with other layers following it. If you intend to run backward directly from this output, you need to pass the gradient of the later layers with respect to output as the gradient argument to backward.
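The sketch below illustrates that situation under assumed names (layer1, layer2, and the shapes are placeholders, not from the original post): the network is split into two stages, backward is first run through the later stage in isolation, and the resulting gradient is then fed into output.backward on the earlier stage.

```python
import torch
import torch.nn as nn

# A hypothetical two-stage network: layer1 -> layer2 -> scalar loss.
layer1 = nn.Linear(4, 3)
layer2 = nn.Linear(3, 1)

x = torch.randn(1, 4)
output = layer1(x)                   # intermediate output of the earlier stage

# Detach so the later stage forms its own separate graph.
output_detached = output.detach().requires_grad_(True)
loss = layer2(output_detached).sum()

# Stage 1: backprop through the later layers only.
loss.backward()                      # fills output_detached.grad

# Stage 2: pass that gradient into backward on the earlier stage.
output.backward(output_detached.grad)

print(layer1.weight.grad)            # same as if loss.backward() ran end to end
```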
