comfyui-moondream
https://github.com/shadowcz007/comfyui-moondream
A ComfyUI custom node for using moondream1
VS WD14-tagger
https://gyazo.com/131df07ef1e8c0c700c7ab67317b2860
moondream.json
$ Please write a detailed caption about this image.
code:moondream
The image features a woman wearing a dress, standing on a set of stairs. She appears to be looking down at her feet, possibly at her shoes. The woman is positioned in the middle of the stairs, with the stairs themselves extending upward. The scene seems to be a part of a larger artwork or composition, as the woman is surrounded by various elements, including a blue sky and a building in the background.
code:WD14-tagger
souryuu_asuka_langley, 1girl, solo, long_hair, breasts, looking_at_viewer, bangs, blue_eyes, dress, hair_between_eyes, sitting, school_uniform, short_sleeves, outdoors, parted_lips, sky, day, orange_hair, blue_sky, feet_out_of_frame, stairs, tokyo-3_middle_school_uniform
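I thought that appending the WD14-tagger caption to the moondream prompt would make it merge the two nicely, but it did not really work. nomadoor.icon
This did not work well either, but a prompt like "output 1 if there is a woman in the image, 0 if not" might allow a rather luxurious kind of conditional branching.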
Conditional branching using an MLLM
It feels like the MLLM era has arrived.
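The 1/0 branching idea above can be sketched roughly as follows. This is only a minimal illustration, not how comfyui-moondream works: `BRANCH_PROMPT`, `parse_binary_answer`, `branch_on_image`, and the `ask_model` callable are all hypothetical names, and `ask_model` stands in for whatever actually sends the prompt and image to the model. Since the model may reply with a full sentence instead of a digit, the parser returns `None` for ambiguous replies and the caller falls back to a default.

```python
import re
from typing import Callable, Optional

# Hypothetical prompt wording (an assumption, not taken from comfyui-moondream).
BRANCH_PROMPT = (
    "Is there a woman in this image? "
    "Answer with a single digit: 1 for yes, 0 for no."
)


def parse_binary_answer(answer: str) -> Optional[bool]:
    """Return True for 1/yes, False for 0/no, None if the reply is ambiguous."""
    text = answer.strip().lower()
    # Prefer a standalone 0 or 1 anywhere in the reply.
    match = re.search(r"\b([01])\b", text)
    if match:
        return match.group(1) == "1"
    # Fall back to a reply that starts with yes/no.
    if re.match(r"yes\b", text):
        return True
    if re.match(r"no\b", text):
        return False
    return None


def branch_on_image(ask_model: Callable[[str], str], default: bool = False) -> bool:
    """Ask the model the yes/no question and pick a branch.

    ask_model is a hypothetical callable that sends the prompt (together with
    the image) to the MLLM and returns its text reply.
    """
    result = parse_binary_answer(ask_model(BRANCH_PROMPT))
    return default if result is None else result
```

For example, `branch_on_image(lambda prompt: "1")` takes the "woman present" branch, while a free-form caption as the reply falls through to `default`.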
More versatile alternatives:
Comfyui image2prompt
ComfyUI VLM nodes