Transformer
https://isobe324649.hatenablog.com/entry/2023/03/20/215000
https://gyazo.com/76f4f83f061c152b478c35ea07cd4cf7
Left: encoder
A structure that is good at understanding the source text of the translation
Later put to use in BERT
https://github.com/facebookresearch/xformers#transformers-key-concepts
You'll find the key repository boundaries in this illustration: a Transformer is generally made of a collection of attention mechanisms, embeddings to encode some positional information, feed-forward blocks, and a residual path (typically referred to as pre- or post- layer norm).
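The components listed above (attention, feed-forward block, residual path with layer norm) can be put together in a minimal NumPy sketch. This is a simplified single-head block without learned positional embeddings or multi-head splitting; the function and weight names are illustrative, not from any library, and the post-layer-norm arrangement is just one of the two variants the quote mentions.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # (n, n) similarity scores
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)              # softmax over the keys
    return w @ V

def layer_norm(x, eps=1e-5):
    """Normalize each token vector to zero mean and unit variance."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def transformer_block(x, W_q, W_k, W_v, W1, W2):
    """Post-layer-norm variant: sublayer -> add residual -> normalize."""
    attn = scaled_dot_product_attention(x @ W_q, x @ W_k, x @ W_v)
    x = layer_norm(x + attn)                        # residual + layer norm
    ff = np.maximum(0.0, x @ W1) @ W2               # feed-forward with ReLU
    return layer_norm(x + ff)                       # second residual + norm
```

In the pre-layer-norm variant, `layer_norm` would instead be applied to the input of each sublayer before the residual addition.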
June 08, 2022: Transformerの最前線 〜 畳込みニューラルネットワークの先へ 〜 (The frontier of Transformers: beyond convolutional neural networks) - Speaker Deck
via https://twitter.com/biomedicalhacks/status/1542636599502024705
https://arxiv.org/abs/2207.09238
The original Transformer paper's main text is hard to follow
This paper sets out to write everything down so that it can be understood
Assumes familiarity with MLPs
https://overcast.fm/+MhOrrh3D8
The basic architecture comes in three patterns:
encoder-decoder Transformer
encoder-only Transformer
decoder-only Transformer
GPT is this one
Words like "decoder" are used for historical reasons; it isn't actually decoding anything
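The practical difference between the three patterns shows up in the attention mask: encoder-only models (BERT-style) attend bidirectionally, while decoder-only models (GPT-style) mask out future positions. A minimal sketch of that distinction, with illustrative names and single-head attention assumed:

```python
import numpy as np

def attention_weights(Q, K, causal=False):
    """Softmax attention weights over the keys.

    causal=False: every position may attend to every other
                  (encoder-only, BERT-style).
    causal=True:  position i may only attend to positions <= i
                  (decoder-only, GPT-style).
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    if causal:
        n = scores.shape[0]
        # Keep the lower triangle (incl. diagonal); mask the future with -inf.
        keep = np.tril(np.ones((n, n), dtype=bool))
        scores = np.where(keep, scores, -np.inf)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return w / w.sum(axis=-1, keepdims=True)
```

With `causal=True` every entry above the diagonal becomes exactly zero, which is what lets a decoder-only model be trained on next-token prediction.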
https://www.youtube.com/watch?v=50XvMaWhiTY&list=PLhDAH9aTfnxKXf__soUoAEOrbLAOnVHCP&index=29
https://www.youtube.com/watch?v=FFoLqib6u-0&list=PLhDAH9aTfnxKXf__soUoAEOrbLAOnVHCP&index=30
https://www.youtube.com/watch?v=n1QYofU3_hY&list=PLhDAH9aTfnxKXf__soUoAEOrbLAOnVHCP&index=40
Scaling law
Google holds a patent on it
US10452978B2 - Attention-based sequence transduction neural networks - Google Patents
https://twitter.com/vintersn0w/status/1598543121012641792
https://twitter.com/h_okumura/status/1598542406973988864