[Flux.1-Dev] Generating consistent sprite sheets with prompts alone

Source of the idea: https://www.reddit.com/r/StableDiffusion/comments/1fdycbp/may_be_of_interest_flux_can_generate_highly/ "May be of interest.. Flux can generate highly conistent controllable frames by prompting alone. No controlnet used, just words."

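With FLUX.1-dev, simply writing a prompt along the lines of "an image with 2 frames; the first shows ◯◯, the second shows ◯◯" produces a sprite sheet, and the characters and objects that appear stay reasonably consistent across frames.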
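As a rough sketch of that prompt pattern, the helper below appends a frame-split sentence and one sentence per frame to a scene description. The function name `build_frame_prompt`, its parameters, and the shortened scene text are made up for illustration. The wording simply mirrors the example prompt on this page and is not claimed to be the only phrasing that works.

```python
def build_frame_prompt(scene: str, frame_actions: list[str]) -> str:
    """Append a frame-split instruction and one sentence per frame to a scene description.

    Hypothetical helper: the phrasing follows the example prompt on this page.
    """
    ordinals = ["first", "second", "third", "fourth"]
    if len(frame_actions) < 2:
        raise ValueError("need at least two frames")
    if len(frame_actions) > len(ordinals):
        raise ValueError(f"this sketch supports at most {len(ordinals)} frames")

    lines = [
        scene.strip(),
        f"The image is divided into {len(frame_actions)} frames.",
    ]
    for ordinal, action in zip(ordinals, frame_actions):
        # Normalize trailing periods so every frame sentence ends with exactly one.
        lines.append(f"In the {ordinal} frame, {action.strip().rstrip('.')}.")
    return "\n\n".join(lines)


# Shortened scene text, used here only as an example input.
prompt = build_frame_prompt(
    "Film photograph of a woman sitting on a sofa in her room late at night.",
    ["she is reaching down, looking down", "she is getting up from the sofa"],
)
print(prompt)
```

The resulting string can be pasted into whatever text-to-image front end is in use. Only the text is built here; no model is called.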

The same thing worked without issue on FLUX.1-schnell (confirmed using the prompt below as-is).

Generating a single still image with Wan2.1 also worked (though less stably), so this may be possible with most diffusion Transformer models.

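This has technically been possible since the SD1.5 era, so rather than being a trait of DiT itself, I suspect the bigger factors are that DiT made training on huge numbers of images feasible and that text encoders have improved nomadoor.icon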

I see, this is the first I've heard about the 1.5-era part... morisoba65536.icon

https://gyazo.com/1ecb26d7a9f2f9f558b02e91114cc692

code:prompt

filmoto, film grain, Film photograph of a beautiful woman sitting on a sofa in her room late at night with a cool expression on her face, illuminated by a desk light on a side table. A blurred night view can be seen outside the window. She is wearing a slightly oversized, finely decorated yellow knit and a long black skirt, which is beautifully enhanced by light-coloured lipstick. Her small earrings shine in the reflection. Her bobbed hair is shaggy from sleep.

The image is divided into 2 frames.

In the first frame, she is reaching down, looking down.

In the second frame, she is getting up from the sofa

Combining it with the sprite sheet technique nomadoor.icon

https://gyazo.com/a72b2885009cd0b2f621468e1f8c451a

https://gyazo.com/4f8013ae84f917f5db16e1a9d22224e3

https://openart.ai/workflows/-/-/ymZAWjzCKjPTjiSf7ivm Flux_consistent_frames_with_ControlNet.json

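There is an existing technique for producing several consistent images by arranging multiple images in a grid and running image2image (or ControlNet) on it, so let's try combining it with the prompt above.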

With the prompt alone, the 2 frames sometimes did not split cleanly in half, and 4 frames did not work at all, but adding ControlNet Tile makes the output surprisingly stable.
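To make the grid layout concrete, here is a minimal sketch that computes where each frame should sit on the sheet when the split is exact. The function name `frame_boxes` and the sheet sizes are assumptions for illustration. The returned `(left, top, right, bottom)` tuples use the same box convention as Pillow's `Image.crop` and `Image.paste`, so they could be used to assemble the grid input or to cut the generated sheet back into frames.

```python
def frame_boxes(
    sheet_width: int, sheet_height: int, cols: int, rows: int = 1
) -> list[tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) pixel boxes for every grid cell, row by row.

    Hypothetical helper: assumes equally sized cells with no gutter between them.
    """
    if cols < 1 or rows < 1:
        raise ValueError("cols and rows must be at least 1")
    if sheet_width % cols or sheet_height % rows:
        raise ValueError("sheet size must divide evenly into the grid")

    cell_w = sheet_width // cols
    cell_h = sheet_height // rows
    return [
        (c * cell_w, r * cell_h, (c + 1) * cell_w, (r + 1) * cell_h)
        for r in range(rows)
        for c in range(cols)
    ]


# Example sizes only: a side-by-side 2-frame sheet and a 2x2 4-frame sheet.
two_frames = frame_boxes(1024, 512, cols=2)
four_frames = frame_boxes(1024, 1024, cols=2, rows=2)
print(two_frames)
print(four_frames)
```

Comparing these boxes with where the frame boundary actually lands in a prompt-only generation is one way to check how far the split drifts from an even half.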