Attention mechanism (from the slides "ゼロから始める転移学習", "Transfer Learning from Scratch")

Covered as an attention mechanism:

Transformer (slide 37)

Performs translation with attention alone, without using LSTMs

Two extensions to attention: "Query, Key, Value" and "Self-attention"

The Transformer's encoder and decoder
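
To make the "Query, Key, Value" and "Self-attention" ideas concrete, here is a minimal sketch of scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V, in plain Python. It assumes a single head, no masking, and no batching. The toy embeddings and the identity projection matrices are made-up values for illustration, and the function names are hypothetical, not taken from the slides.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        all_weights.append(weights)
        # Output is the attention-weighted average of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs, all_weights


def self_attention(x, w_q, w_k, w_v):
    """Self-attention: Q, K and V are all projections of the same sequence x."""
    return attention(matmul(x, w_q), matmul(x, w_k), matmul(x, w_v))


# Toy input: 3 tokens with 2-dimensional embeddings (made-up numbers).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical projection weights
outputs, weights = self_attention(x, identity, identity, identity)
```

Each row of `weights` sums to 1 and says how much one token attends to every token in the same sequence. In encoder-decoder attention the same `attention` function would be called with queries from the decoder and keys and values from the encoder.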

Impression: "Attention Is All You Need" extended attention and proposed the Transformer

Attention itself already existed before that paper!

👉 The Transformer leads to BERT! (The starting point of "BERTology". This may well count as a paradigm shift.)