🦊Qwen-Image

Qwen-Image is an image generation model developed with a focus on rendering Chinese and English text.

Its text encoder is Qwen2.5-VL, a VLM far more capable than T5, so prompt comprehension and adherence are quite high.

References

https://docs.comfy.org/tutorials/image/qwen/qwen-image ComfyUI official docs

https://github.com/QwenLM/Qwen-Image QwenLM/Qwen-Image

Recommended resolutions

About 1.5 to 1.8 megapixels

1:1: 1328 x 1328

16:9: 1664 x 928

4:3: 1472 x 1104

3:2: 1584 x 1056
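
The pixel counts behind these presets can be checked with a few lines of Python. This is only an illustrative sketch: `PRESETS` and `megapixels` are names made up for this example and are not part of ComfyUI or Qwen-Image.

```python
# Recommended Qwen-Image resolutions from the list above, as (width, height).
PRESETS = {
    "1:1": (1328, 1328),
    "16:9": (1664, 928),
    "4:3": (1472, 1104),
    "3:2": (1584, 1056),
}


def megapixels(width: int, height: int) -> float:
    """Return the pixel count in millions of pixels."""
    return width * height / 1_000_000


for ratio, (w, h) in PRESETS.items():
    print(f"{ratio}: {w} x {h} = {megapixels(w, h):.2f} MP")
```

All four presets land between roughly 1.5 and 1.8 megapixels, so the same helper can be used to sanity-check a custom size before generating.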

Model downloads

https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/tree/main/split_files/diffusion_models qwen_image_(bf16 or fp8).safetensors

https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/tree/main/split_files/text_encoders qwen_2.5_vl_7b_(fp8).safetensors

https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/tree/main/split_files/vae qwen_image_vae.safetensors

code:models

 📂ComfyUI/
 └── 📂models/
     ├── 📂diffusion_models/
     │   └── qwen_image_(bf16 or fp8).safetensors
     ├── 📂text_encoders/
     │   └── qwen_2.5_vl_7b_(fp8).safetensors
     └── 📂vae/
         └── qwen_image_vae.safetensors
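
If a workflow fails to find a model, a quick check of the folder layout can help. The sketch below is an assumption-laden helper, not a ComfyUI feature: `EXPECTED`, `find_missing`, and the `"ComfyUI"` root path are hypothetical, and the glob patterns simply stand in for whichever precision variant was downloaded.

```python
from pathlib import Path

# Expected locations relative to the ComfyUI root, taken from the tree above.
# The glob patterns are an assumption: they stand in for whichever
# precision variant (bf16 or fp8) was actually downloaded.
EXPECTED = {
    "diffusion_models": "qwen_image_*.safetensors",
    "text_encoders": "qwen_2.5_vl_7b*.safetensors",
    "vae": "qwen_image_vae.safetensors",
}


def find_missing(comfy_root: str) -> list[str]:
    """Return the models/ subfolders that contain no matching file."""
    models = Path(comfy_root) / "models"
    return [
        folder
        for folder, pattern in EXPECTED.items()
        if not any((models / folder).glob(pattern))
    ]


if __name__ == "__main__":
    missing = find_missing("ComfyUI")  # hypothetical install path
    print("missing:", missing or "nothing")
```

An empty result means each of the three folders contains at least one file matching its pattern.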

text2image

https://gyazo.com/c616d93666ef99e2812d3db2a40d194a

Qwen-Image_fp8.json

GGUF

Custom node

https://github.com/city96/ComfyUI-GGUF ComfyUI-GGUF

Model downloads

https://huggingface.co/city96/Qwen-Image-gguf/tree/main city96/Qwen-Image-gguf

https://huggingface.co/unsloth/Qwen2.5-VL-7B-Instruct-GGUF/tree/main unsloth/Qwen2.5-VL-7B-Instruct-GGUF

code:model

 📂ComfyUI/
 └── 📂models/
     ├── 📂text_encoders/
     │   └── Qwen2.5-VL-7B-Instruct-.gguf
     └── 📂unet/
         └── qwen-image-.gguf

https://gyazo.com/2321412b578aaf8bf1c4635c08c85d89

Qwen-Image_gguf.json

Lightning

A distilled model that can generate in 8 or 4 steps.

A LoRA version is also available, so that is what we use here.

Model downloads

https://huggingface.co/lightx2v/Qwen-Image-Lightning/blob/main/Qwen-Image-Lightning-8steps-V1.1-bf16.safetensors Qwen-Image-Lightning-8steps-V1.1-bf16.safetensors

https://huggingface.co/lightx2v/Qwen-Image-Lightning/blob/main/Qwen-Image-Lightning-4steps-V1.0-bf16.safetensors Qwen-Image-Lightning-4steps-V1.0-bf16.safetensors
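
The two LoRA links above point at Hugging Face file pages (`/blob/`). For scripted downloads, the same path with `/resolve/` serves the raw file. The helper below is a small sketch of that conversion; `to_download_url` and `LORA_PAGES` are names invented for this example.

```python
def to_download_url(blob_url: str) -> str:
    """Turn a Hugging Face file-page URL (/blob/) into a direct-download URL (/resolve/)."""
    if "/blob/" not in blob_url:
        raise ValueError(f"not a Hugging Face blob URL: {blob_url}")
    # Replace only the first occurrence so the rest of the path is left untouched.
    return blob_url.replace("/blob/", "/resolve/", 1)


LORA_PAGES = [
    "https://huggingface.co/lightx2v/Qwen-Image-Lightning/blob/main/Qwen-Image-Lightning-8steps-V1.1-bf16.safetensors",
    "https://huggingface.co/lightx2v/Qwen-Image-Lightning/blob/main/Qwen-Image-Lightning-4steps-V1.0-bf16.safetensors",
]

for page in LORA_PAGES:
    print(to_download_url(page))
```

The printed URLs can be passed to any downloader. Saving the files under `ComfyUI/models/loras/` is the usual ComfyUI location for LoRA files, but this page does not state it, so treat that path as an assumption.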

https://gyazo.com/021df7ba0cdf8e6869809c633b9f7803

Qwen-Image_lighting_8steps.json

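As a somewhat unusual(?) quirk, the latent can come from either the 🦊EmptySD3LatentImage node or the 🦊EmptyHunyuanLatentVideo node.

wip: not yet confirmed whether this comes from compatibility between those two nodes or from how Qwen-Image (whose VAE is Wan-compatible) is designed

Even compared with earlier models (Wan2.1/Wan2.2, HiDream-I1), it is a monolithic, non-MoE model whose 20B active parameters are the most among models supported by ComfyUI

Perhaps for that reason, the trick that worked for HiDream and Wan, using DistanceSampler to cut the step count for an easy speedup, has almost no effect (Euler at 20 steps and Distance at 7 steps take nearly the same processing time)

If you want more speed, Qwen-Image-Distill and Qwen-Image-Lightning have come out, so use those

However, as of 2025/08/10 the stable Desktop release does not support this; using it requires the latest beta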
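
Taking that timing observation at face value, the implied per-step cost can be worked out directly. This is a back-of-the-envelope sketch only: it assumes both runs take exactly the same total time and ignores fixed overhead such as text encoding and VAE decoding. `relative_step_cost` is a hypothetical helper name.

```python
def relative_step_cost(baseline_steps: int, other_steps: int) -> float:
    """Per-step cost of the other sampler relative to the baseline,
    assuming both runs take the same total time."""
    return baseline_steps / other_steps


# Euler at 20 steps vs. Distance at 7 steps, as observed above.
ratio = relative_step_cost(baseline_steps=20, other_steps=7)
print(f"One Distance step costs about {ratio:.1f}x one Euler step")
```

In other words, under these assumptions each Distance step costs nearly three Euler steps on this model, which cancels out the saving from running fewer steps.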