PIXART-α

https://gyazo.com/13ca968f92a0a3ef4e14b5e301d2e9bf

https://gyazo.com/237bc783faa372433924270d41ef4f30 https://gyazo.com/eb401430abdf8291ac5a548477caba2d https://gyazo.com/c67ef72919587d5dbe2e64438b20e7f8

https://github.com/PixArt-alpha/PixArt-alpha PixArt-alpha/PixArt-alpha

Demo https://huggingface.co/spaces/PixArt-alpha/PixArt-alpha

https://pixart-alpha.github.io/ Project

https://arxiv.org/abs/2310.00426 PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

https://huggingface.co/PixArt-alpha/PixArt-XL-2-1024-MS PixArt-alpha/PixArt-XL-2-1024-MS

Diffusion Transformer

A Transformer-based image generation model

Dataset

Captions are generated with LLaVA

LAION, Segment Anything, Internal

Internal is a dataset of aesthetically pleasing images

https://scrapbox.io/files/651bde4b7a2be9001bcd8036.svg

Training cost and CO2 emissions are positively correlated

PIXART-α's training cost is 11.1% of GigaGAN's, and only 0.85% of RAPHAEL's

https://gyazo.com/e12b3944f60dd8940f926a296cd69f61

Huh, interesting. wogikaze.icon
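
To get a feel for the ratios above, the percentages can be turned into "how many times more expensive is the baseline" factors. This is only a minimal sketch: the helper name `times_cheaper` is made up for illustration, and the only inputs are the two percentages quoted in this note.

```python
def times_cheaper(percent_of_baseline: float) -> float:
    """Convert 'costs X% of the baseline' into 'the baseline costs N times as much'."""
    if percent_of_baseline <= 0:
        raise ValueError("percentage must be positive")
    return 100.0 / percent_of_baseline


# Relative training cost of PIXART-α, as quoted in this note.
gigagan_factor = times_cheaper(11.1)   # compared with GigaGAN
raphael_factor = times_cheaper(0.85)   # compared with RAPHAEL

print(f"GigaGAN costs about {gigagan_factor:.1f}x as much to train")
print(f"RAPHAEL costs about {raphael_factor:.1f}x as much to train")
```

Under these numbers, the baselines come out at roughly 9 times (GigaGAN) and roughly 118 times (RAPHAEL) the training cost of PIXART-α.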

Training from scratch requires 25 million images

At 1024×1024: 15,000 A100 GPU hours

Has the cost of building Stable Diffusion 2 from scratch dropped from $160,000 to $30,000?
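
The GPU-hour figure and the dollar figure can be related with simple arithmetic. The sketch below assumes a hypothetical rental price of $2 per A100 GPU hour. That price is not stated in this note and real cloud prices vary; it is chosen only because it makes 15,000 GPU hours line up with the $30,000 figure.

```python
A100_GPU_HOURS = 15_000          # figure quoted in this note
ASSUMED_USD_PER_GPU_HOUR = 2.0   # assumption for illustration only, not from the source


def training_cost_usd(gpu_hours: float, usd_per_gpu_hour: float) -> float:
    """Estimate rental cost as GPU hours multiplied by an hourly price."""
    return gpu_hours * usd_per_gpu_hour


cost = training_cost_usd(A100_GPU_HOURS, ASSUMED_USD_PER_GPU_HOUR)
gpu_days = A100_GPU_HOURS / 24   # same workload expressed in single-GPU days

print(f"{A100_GPU_HOURS} GPU hours = {gpu_days:.0f} GPU days, about ${cost:,.0f}")
```

With a different hourly price the estimate scales linearly, so the dollar figure should be read as an order-of-magnitude check rather than a precise cost.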

#画像生成モデル