The padding_idx argument of torch.nn.Embedding

https://pytorch.org/docs/stable/generated/torch.nn.Embedding.html

An optional argument.

If specified, the entries at padding_idx do not contribute to the gradient;

In other words, inputs equal to padding_idx do not contribute to the gradient.

therefore, the embedding vector at padding_idx is not updated during training, i.e. it remains as a fixed “pad”.

So the embedding vector at padding_idx is not updated during training.
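
A minimal sketch to check this behaviour by hand. The toy loss (a plain sum), the SGD optimizer, and the learning rate are assumptions chosen only to produce gradients; they do not come from the documentation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
embedding = nn.Embedding(10, 3, padding_idx=0)
# Assumption: plain SGD with an arbitrary learning rate, just to take one step.
optimizer = torch.optim.SGD(embedding.parameters(), lr=0.1)

inputs = torch.tensor([[0, 2, 0, 5]])
pad_before = embedding.weight[0].detach().clone()
row2_before = embedding.weight[2].detach().clone()

# Toy loss: sum of all looked-up vectors.
loss = embedding(inputs).sum()
loss.backward()

grad = embedding.weight.grad
pad_grad_is_zero = bool((grad[0] == 0).all())      # row at padding_idx gets no gradient
row2_grad_is_nonzero = bool((grad[2] != 0).any())  # an ordinary row does

optimizer.step()
pad_unchanged = torch.equal(embedding.weight[0].detach(), pad_before)
row2_changed = not torch.equal(embedding.weight[2].detach(), row2_before)

print(pad_grad_is_zero, row2_grad_is_nonzero, pad_unchanged, row2_changed)
```

All four flags should come out True: the row at padding_idx receives a zero gradient and stays put, while row 2 moves.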

For a newly constructed Embedding, the embedding vector at padding_idx will default to all zeros, but can be updated to another value to be used as the padding vector.

By default, every element of the embedding vector at padding_idx is 0.

The documentation also includes an example that updates the weight at padding_idx.
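
A short sketch along the lines of that documentation example: it first checks the all-zeros default, then overwrites the padding row under `torch.no_grad()`. The variable names `default_is_zero` and `looked_up` are made up here for illustration.

```python
import torch
import torch.nn as nn

padding_idx = 0
embedding = nn.Embedding(3, 3, padding_idx=padding_idx)

# A newly constructed Embedding starts with zeros at padding_idx.
default_is_zero = bool((embedding.weight[padding_idx] == 0).all())

# Overwrite the padding vector without recording the write in autograd.
with torch.no_grad():
    embedding.weight[padding_idx] = torch.ones(3)

# Looking up padding_idx now returns the new padding vector.
looked_up = embedding(torch.tensor([padding_idx]))
print(default_is_zero, looked_up)
```

The new value is then kept as the fixed pad vector, since the row at padding_idx still receives no gradient.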

Because padding_idx exists, texts of different lengths can be filled with padding so that their lengths match.
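
One possible way to do the padding, sketched with `torch.nn.utils.rnn.pad_sequence`. The token ids and the choice of 0 as the padding id (`PAD_ID`) are hypothetical; the note itself does not say how the padding is produced.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence

PAD_ID = 0  # assumption: id 0 is reserved for padding

# Hypothetical token-id sequences of different lengths.
sequences = [
    torch.tensor([4, 7, 2]),
    torch.tensor([5]),
    torch.tensor([3, 9]),
]

# Right-pad every sequence with PAD_ID up to the longest length.
batch = pad_sequence(sequences, batch_first=True, padding_value=PAD_ID)

embedding = nn.Embedding(10, 3, padding_idx=PAD_ID)
outputs = embedding(batch)

print(batch)
print(outputs.shape)
```

Every padded position maps to the zero vector, so the batch has a regular shape of (3, 3, 3) without the padding contributing to the gradient.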

code:python

>>> import torch
>>> import torch.nn as nn
>>> torch.manual_seed(1)
>>> # embedding = nn.Embedding(10, 3)  # match the documentation example
>>> embedding = nn.Embedding(10, 3, padding_idx=0)
>>> inputs = torch.LongTensor([[0, 2, 0, 5]])
>>> outputs = embedding(inputs)
>>> outputs.size()
torch.Size([1, 4, 3])
>>> outputs
tensor([[[ 0.0000,  0.0000,  0.0000],   # zero vector because this is padding_idx
         [ 0.7626,  0.4415, -0.0091],
         [ 0.0000,  0.0000,  0.0000],   # zero vector because this is padding_idx
         [ 0.4085,  0.2579,  1.0950]]], grad_fn=<EmbeddingBackward0>)