Let's try Gemma3!

This article records my working notes and impressions from using an LLM called Gemma3.

What is Gemma3?

An LLM developed by Google.

It comes in several sizes (1B, 4B, 12B, and 27B), and the models at 4B and above can also take images as input.
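The size rule above can be written down as a small helper, which is handy when a script has to pick a model depending on whether it needs image input. This is only a sketch: `supports_images` and `ollama_tag` are hypothetical names, and the tag pattern `gemma3:<size>b` is assumed from the `gemma3:27b` tag used later in this note.

```python
# Sizes listed in this note, in billions of parameters.
GEMMA3_SIZES_B = (1, 4, 12, 27)

# Per this note, models at 4B and above accept image input.
MIN_VISION_SIZE_B = 4


def supports_images(size_b: int) -> bool:
    """Return True if the given Gemma3 size accepts images (per the note above)."""
    if size_b not in GEMMA3_SIZES_B:
        raise ValueError(f"unknown Gemma3 size: {size_b}B")
    return size_b >= MIN_VISION_SIZE_B


def ollama_tag(size_b: int) -> str:
    """Build a model tag such as "gemma3:27b" (assumed naming pattern)."""
    if size_b not in GEMMA3_SIZES_B:
        raise ValueError(f"unknown Gemma3 size: {size_b}B")
    return f"gemma3:{size_b}b"
```

For example, `supports_images(1)` is false, so a captioning script should use one of the 4B, 12B, or 27B models.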

Running Gemma3 (this time, for image captioning)

https://scrapbox.io/files/673e8a3e004d4f98df37c354.png

This time I try it through ollama.

I confirmed that the 27B model runs on an RTX3090.

code:python
 # 20250811
 # Test-run script for Gemma3 via ollama
 import os
 from ollama import Client
 # Initialize the Ollama client (connects to the local server by default)
 client = Client()
 image_model = "gemma3:27b"
 # Ask the model to describe a single image
 def explaining_image(image_path):
     response = client.generate(
         model=image_model,
         prompt="Please describe the image.",
         images=[image_path],  # images must be passed as a list
     )
     return response["response"].strip()
 # Main function: list the images in a folder and caption the selected one
 def main():
     image_folder = "/path/to/img_folder"  # <- set this to your image folder
     image_files = [
         f for f in os.listdir(image_folder)
         if f.lower().endswith((".png", ".jpg", ".jpeg", ".gif"))
     ]
     if not image_files:
         print("No image files found.")
         return
     print("Available image files:")
     for i, file in enumerate(image_files, 1):
         print(f"{i}. {file}")
     choice = int(input("Enter the number of the image to analyze: ")) - 1
     if 0 <= choice < len(image_files):
         image_path = os.path.join(image_folder, image_files[choice])
         result = explaining_image(image_path)
         print(result)
     else:
         print("Invalid selection.")
 if __name__ == "__main__":
     main()

Example output:

Here's a description of the image:

Visual Description:

The image depicts a vibrant, bright red apple with large, feathered white wings extending from its sides. The apple has a green stem and a few leaves. The background is a dramatic, colorful sky with a mix of golden and dark clouds, giving the impression of either sunrise or sunset.

Overall Impression:

The image has a whimsical and surreal quality. The combination of the ordinary (an apple) with the fantastical (wings) creates a visually striking and imaginative composition. It evokes themes of flight, transformation, and perhaps a playful take on the concept of "forbidden fruit."

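Because the model is smart, it tends to add extra elements to its output.

(Example) A line such as "Here is the caption for this image" gets prepended to the answer.

Careful prompt design is needed to get the output you want.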
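One way to approach this is to pair a stricter prompt with a small cleanup step for the cases where the model adds a lead-in anyway. The following is a minimal sketch, not a tested recipe: the wording of `CAPTION_PROMPT`, the helper name `strip_preamble`, and the list of lead-in phrases are all assumptions made for illustration.

```python
import re

# Hypothetical prompt that asks for the caption only.
CAPTION_PROMPT = (
    "Describe the image in one or two sentences. "
    "Output only the caption itself, with no introduction, "
    "headings, or closing remarks."
)

# Illustrative lead-in patterns such as "Here's a description of the image:".
# The phrase list is an assumption; extend it to match what the model emits.
_PREAMBLE = re.compile(
    r"^\s*(?:here(?:'s| is)|the following is|below is)\b[^\n:]*:\s*",
    re.IGNORECASE,
)


def strip_preamble(text: str) -> str:
    """Remove one leading lead-in line, if present, and trim whitespace."""
    return _PREAMBLE.sub("", text.strip(), count=1).strip()
```

`CAPTION_PROMPT` would be passed as the `prompt` argument in place of "Please describe the image.", and `strip_preamble` applied to the returned text. The regex is a fallback only, since it can also strip a legitimate sentence that happens to start with "Here is ...:".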

It is smarter than MiniCPM-V!

#Yuma_Oe