http://nhiro.org.s3.amazonaws.com/c/2/c29dff41c53d5546b7d7e111009bc026.jpg https://gyazo.com/c29dff41c53d5546b7d7e111009bc026
(OCR text)
Figure 1: The Transformer - model architecture
(Diagram labels: Input/Output Embedding + Positional Encoding; Nx encoder blocks of Multi-Head Attention and Feed Forward, each with Add & Norm; Nx decoder blocks adding Masked Multi-Head Attention; Linear → Softmax → Output Probabilities; Outputs are shifted right)
Handwritten note: explained the important attention mechanism and PE (positional encoding)
Venue discussion memo: PE is mixed in by addition
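The memo's point that "PE is mixed in by addition" can be sketched in a few lines. This is my own minimal NumPy illustration (not from the page) of the sinusoidal positional encoding from the Transformer paper; the embedding values are hypothetical placeholders:

```python
import numpy as np

def sinusoidal_pe(seq_len, d_model):
    # Sinusoidal positional encoding from "Attention Is All You Need":
    #   PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    #   PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    pos = np.arange(seq_len)[:, None]        # shape (seq_len, 1)
    i = np.arange(0, d_model, 2)[None, :]    # even dimension indices
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

# "PE is mixed in by addition": the encoding has the same shape as the
# token embeddings, so the position signal is added element-wise.
embeddings = np.random.randn(10, 512)        # (seq_len, d_model), hypothetical
x = embeddings + sinusoidal_pe(10, 512)
```

Because the mix is a plain element-wise sum (not concatenation), the model dimension stays at d_model and position information is carried in the same space as the token content.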