🦊PixelDiT / PiD
Reference
https://github.com/Comfy-Org/ComfyUI/pull/14103 feat: Support NVIDIA PixelDiT and PiD (CORE-201)
PixelDiT
A pixel-space diffusion model from NVIDIA
Model downloads
diffusion_models
https://huggingface.co/Comfy-Org/PixelDiT/blob/main/diffusion_models/pixeldit_1300m_1024px_bf16.safetensors pixeldit_1300m_1024px_bf16.safetensors (2.6 GB)
text_encoders
https://huggingface.co/Comfy-Org/PixelDiT/blob/main/text_encoders/gemma_2_2b_it_elm_bf16.safetensors gemma_2_2b_it_elm_bf16.safetensors (5.23 GB)
code:text
📂ComfyUI/
└── 📂models/
    ├── 📂diffusion_models/
    │   └── pixeldit_1300m_1024px_bf16.safetensors
    └── 📂text_encoders/
        └── gemma_2_2b_it_elm_bf16.safetensors
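The links above point at Hugging Face file pages (`/blob/`). As a minimal sketch, the helpers below turn such a link into a direct-download URL (`/resolve/`) and work out where the file belongs under ComfyUI. The function names and the `comfy_root` default are hypothetical, and the sketch assumes the repository sub-folder (`diffusion_models`, `text_encoders`) mirrors the ComfyUI `models/` sub-folder, as it does for the files listed here.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def blob_to_resolve(url: str) -> str:
    """Rewrite a Hugging Face file-page URL (/blob/) as a direct-download URL (/resolve/)."""
    return url.replace("/blob/", "/resolve/", 1)


def local_destination(url: str, comfy_root: str = "ComfyUI") -> PurePosixPath:
    """Return the path under <comfy_root>/models/ where the linked file should be saved.

    Assumption: everything after "<org>/<repo>/blob/<revision>/" in the URL
    is already the path relative to the ComfyUI models folder.
    """
    parts = PurePosixPath(urlparse(url).path).parts
    # parts looks like ('/', 'Comfy-Org', 'PixelDiT', 'blob', 'main', 'diffusion_models', '<file>')
    start = parts.index("blob") + 2  # skip "blob" and the revision name
    return PurePosixPath(comfy_root) / "models" / PurePosixPath(*parts[start:])


if __name__ == "__main__":
    link = (
        "https://huggingface.co/Comfy-Org/PixelDiT/blob/main/"
        "diffusion_models/pixeldit_1300m_1024px_bf16.safetensors"
    )
    print(blob_to_resolve(link))
    print(local_destination(link))
```

The printed URL can be passed to any downloader, and the printed path matches the folder tree shown above.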
text2image
https://gyazo.com/fd716c0bb5f65a54aad3363c31da7d15
PixelDiT_text2image.json
PiD
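Uses PixelDiT to take over the VAE decode of an existing latent diffusion model (and upscales along the way)
A 4-step model
Model downloads
Choose the PiD that matches the VAE / latent space (only supported models work)
1024_to_4096 means that feeding PiD a 1024px image and having it output at 4096px works well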
PiD for SDXL
https://huggingface.co/Comfy-Org/PixelDiT/blob/main/diffusion_models/pid_sdxl_1024_to_4096_4step_bf16.safetensors pid_sdxl_1024_to_4096_4step_bf16.safetensors
PiD for Qwen-Image
https://huggingface.co/Comfy-Org/PixelDiT/blob/main/diffusion_models/pid_qwenimage_1024_to_4096_4step_bf16.safetensors pid_qwenimage_1024_to_4096_4step_bf16.safetensors
PiD for Flux1
e.g. Z-Image / Z-Image-Turbo
https://huggingface.co/Comfy-Org/PixelDiT/blob/main/diffusion_models/pid_flux1_512_to_2048_4step_bf16.safetensors pid_flux1_512_to_2048_4step_bf16.safetensors
https://huggingface.co/Comfy-Org/PixelDiT/blob/main/diffusion_models/pid_flux1_1024_to_4096_4step_bf16.safetensors pid_flux1_1024_to_4096_4step_bf16.safetensors
PiD for Flux2
e.g. Flux.2
https://huggingface.co/Comfy-Org/PixelDiT/blob/main/diffusion_models/pid_flux2_512_to_2048_4step_bf16.safetensors pid_flux2_512_to_2048_4step_bf16.safetensors
https://huggingface.co/Comfy-Org/PixelDiT/blob/main/diffusion_models/pid_flux2_1024_to_4096_4step_bf16.safetensors pid_flux2_1024_to_4096_4step_bf16.safetensors (has a known issue)
https://huggingface.co/Comfy-Org/PixelDiT/blob/main/diffusion_models/pid_flux2_1024_to_4096_4step_2606_bf16.safetensors pid_flux2_1024_to_4096_4step_2606_bf16.safetensors
code:models
📂ComfyUI/
└── 📂models/
    └── 📂diffusion_models/
        ├── pid_flux1_512_to_2048_4step_bf16.safetensors
        ├── pid_flux1_1024_to_4096_4step_bf16.safetensors
        ├── pid_flux2_512_to_2048_4step_bf16.safetensors
        └── pid_flux2_1024_to_4096_4step_bf16.safetensors
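The PiD file names encode the target latent family and the intended input and output sizes (for example `flux1` and `1024_to_4096`). The sketch below parses that convention and picks a matching file from a list. It is only an illustration: the regular expression is inferred from the file names listed on this page, and `PidCheckpoint`, `parse_pid_filename`, and `pick_pid` are hypothetical names, not part of ComfyUI.

```python
import re
from typing import Iterable, NamedTuple, Optional


class PidCheckpoint(NamedTuple):
    family: str      # latent family, e.g. "sdxl", "qwenimage", "flux1", "flux2"
    input_px: int    # size the base model is expected to generate at
    output_px: int   # size PiD is expected to output
    filename: str

    @property
    def scale(self) -> int:
        """Upscale factor implied by the name, e.g. 4096 // 1024 == 4."""
        return self.output_px // self.input_px


# Pattern inferred from names such as pid_flux1_1024_to_4096_4step_bf16.safetensors
_PID_NAME = re.compile(r"^pid_(?P<family>[a-z0-9]+)_(?P<inp>\d+)_to_(?P<out>\d+)_4step")


def parse_pid_filename(filename: str) -> Optional[PidCheckpoint]:
    """Return the parsed fields, or None if the name does not follow the PiD convention."""
    m = _PID_NAME.match(filename)
    if m is None:
        return None
    return PidCheckpoint(m["family"], int(m["inp"]), int(m["out"]), filename)


def pick_pid(filenames: Iterable[str], family: str, input_px: int) -> Optional[PidCheckpoint]:
    """Pick the first file whose family and input size match the base model."""
    for name in filenames:
        ckpt = parse_pid_filename(name)
        if ckpt is not None and ckpt.family == family and ckpt.input_px == input_px:
            return ckpt
    return None


if __name__ == "__main__":
    available = [
        "pid_flux1_512_to_2048_4step_bf16.safetensors",
        "pid_flux1_1024_to_4096_4step_bf16.safetensors",
    ]
    # Z-Image shares the Flux1 latent space, so "flux1" is the family to ask for.
    print(pick_pid(available, family="flux1", input_px=1024))
```

Because only supported latent spaces work, a `None` result here is a hint that no suitable PiD file has been downloaded yet.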
Z-Image-Turbo_PiD_ 4k
https://gyazo.com/be589a49f195194b86b2ccef61cdc250
Z-Image-Turbo_PiD_ 4k.json
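Connect the Z-Image latent to PiD Conditioning without decoding it
Set the PiD generation size by scaling up the Z-Image generation size to match the model
This workflow uses 1024_to_4096, so the size is multiplied by 4
The Context Windows (Manual) node is essentially tiling
Use it when you hit OOM, or when the output turns rough on portrait or landscape images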
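The size rule for the PiD stage comes down to simple arithmetic: multiply the base generation size by the ratio encoded in the model name. A minimal sketch follows, assuming the ratio applies equally to width and height. `pid_output_size` is a hypothetical helper, not a ComfyUI function.

```python
from typing import Tuple


def pid_output_size(
    base_width: int,
    base_height: int,
    input_px: int = 1024,
    output_px: int = 4096,
) -> Tuple[int, int]:
    """Scale the base model's generation size by the ratio in the PiD model name.

    input_px / output_px are the two numbers in e.g. "1024_to_4096".
    """
    if input_px <= 0 or output_px % input_px != 0:
        raise ValueError("output_px must be a positive multiple of input_px")
    scale = output_px // input_px  # 4 for 1024_to_4096 and for 512_to_2048
    return base_width * scale, base_height * scale


if __name__ == "__main__":
    # Square image generated at 1024x1024 with a 1024_to_4096 PiD
    print(pid_output_size(1024, 1024))
    # Portrait image; large non-square sizes are where tiling may help
    print(pid_output_size(832, 1216))
```

Both PiD variants listed on this page (`512_to_2048` and `1024_to_4096`) imply the same 4x factor, so the main choice is which base size the upstream model generates at.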