PyLLMCoreでBakLLaVAを動かしてみる
参考
PyLLMCoreライブラリを使ったけれど、LLaVA(VLM)は対応したばかりなので多分今後だいぶ変わるnomadoor.icon
$ python -m venv venv
$ venv/Scripts/activate
$ pip install py-llm-core
BakLLaVAは以下の場所に置く
$ C:\Users\ユーザー名\.cache\py-llm-core\models
CLIPは好きな場所に置く
コードを書く
code:main.py
from llm_core.llm import LLaVACPPModel
model = "BakLLaVA-1-Q4_K_M.gguf"
llm = LLaVACPPModel(
name=model,
llama_cpp_kwargs={
"logits_all": True,
"n_ctx": 8000,
"verbose": False,
"n_gpu_layers": 100, #: Set to 0 if you don't have a GPU "n_threads": 1, #: Set to the number of available CPU cores "clip_model_path": r"F:\BakLLaVA\model\BakLLaVA-1-clip-model.gguf",
}
)
llm.load_model()
history = [
{
'role': 'user',
'content': [
]
}
]
response = llm.ask('Please describe this image as accurately as possible', history=history)
print(response.choices0.message.content) なぜか画像をurlでしか指定できないので、python -m http.serverでローカルサーバーを作って無理やりurlで指定する
実行
$ python main.py
結果
https://gyazo.com/a7b463207dcfce68ab38bcafef0d0185
BakLLaVA
The image features a kitchen with a counter that has various items on it, including several coffee cups and bottles. There is a sink located in the middle of the counter. A large, colorful world map is displayed on the wall behind the counter, adding a unique touch to the space.
In addition to the cups and bottles, there are also some bowls placed on the counter. The kitchen appears to be well-equipped for coffee making and serving, with multiple cups and bottles available for use.
There is a brown wooden cabinet and a kitchen countertop in the picture. On the left side of the cabinet, there is a black sink and two silver water pipes. There are also two thermos and several bottles of soap on the countertop. On the right side of the cabinet, there is a coffee maker and a black electric stove. There are two cups and a black electronic thermometer on the countertop. Behind the cabinet is a blue world map, with English, Chinese, and Latin letters on it.
english_text, no_humans, window, table, scenery, paper
https://gyazo.com/d73a0b0d9a13473a8d106fe467baea07
BakLLaVA
The image features a young woman with black hair, wearing a red bow and sporting a large, red bubble in her mouth. She is also wearing an Asian-style school uniform with a tie. Her expression appears to be a mix of innocence and rebellion, as she chews on the gum with attitude.
CogVLM
A cartoon girl wearing a gray sailor suit is blowing bubble gum. She has a black hair tie and a red tie. Her eyes are red, and she is wearing a black collar. The background is white.
WD14-tagger
1girl, solo, breasts, looking_at_viewer, bangs, simple_background, shirt, black_hair, red_eyes, jewelry, school_uniform, monochrome, upper_body, earrings, serafuku, choker, sailor_collar, hair_bun, neckerchief, eyelashes, piercing, single_hair_bun, ear_piercing, red_neckerchief, spot_color, bubble_blowing, chewing_gum
LoRA学習のデータセットのキャプションをLLaVAにやらせようと思ったけれどどうだろうnomadoor.icon WD14-taggerはアニメ系のキャラにはめっぽう強いな