LLaVA
https://gyazo.com/ce808632e38212b2b01050c02afda740
https://github.com/haotian-liu/LLaVA
haotian-liu/LLaVA
Multimodal
Large Language and Vision Assistant, built towards GPT-4-level capabilities
https://llava-vl.github.io
Project
https://llava.hliu.cc
Demo
https://github.com/haotian-liu/LLaVA/blob/main/docs/MODEL_ZOO.md
Model Zoo
Connects the CLIP ViT-L/14 vision encoder to the Vicuna language model
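A minimal sketch of what connecting CLIP ViT-L/14 to Vicuna can look like, assuming the connector is a single linear projection that maps each image patch feature into the language model's embedding space and that the projected visual tokens are placed in front of the text token embeddings. All names (`project`, `build_llm_input`), the toy dimensions, and the random values are hypothetical stand-ins chosen for illustration; they are not taken from the LLaVA codebase.

```python
import random

# Toy sizes, assumed for illustration only (real encoders are far larger).
NUM_PATCHES = 4   # number of image patch features from the vision encoder
D_VISION = 3      # vision feature width
D_LLM = 5         # language model embedding width
NUM_TEXT = 2      # number of text tokens in the prompt


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of random floats (stand-in for real features)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def project(patch_features, weight, bias):
    """Linearly project each patch feature: out = feat @ weight + bias.

    weight has shape (d_vision, d_llm); bias has length d_llm.
    """
    columns = list(zip(*weight))  # d_llm columns, each of length d_vision
    return [
        [sum(f * w for f, w in zip(feat, col)) + b for col, b in zip(columns, bias)]
        for feat in patch_features
    ]


def build_llm_input(visual_tokens, text_embeddings):
    """Concatenate projected visual tokens and text embeddings into one sequence."""
    return visual_tokens + text_embeddings


rng = random.Random(0)
patch_features = make_matrix(NUM_PATCHES, D_VISION, rng)   # vision encoder output (stand-in)
text_embeddings = make_matrix(NUM_TEXT, D_LLM, rng)        # text token embeddings (stand-in)
weight = make_matrix(D_VISION, D_LLM, rng)                 # trainable projection weight
bias = [0.0] * D_LLM                                       # trainable projection bias

visual_tokens = project(patch_features, weight, bias)
llm_input = build_llm_input(visual_tokens, text_embeddings)
print(len(llm_input), len(llm_input[0]))  # prints: 6 5
```

In this sketch only `weight` and `bias` belong to the connector; the vision encoder and the language model are treated as black boxes that produce and consume embedding vectors.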
Llama 2 support
https://huggingface.co/liuhaotian/llava-llama-2-13b-chat-lightning-preview
LLaVA-1.5
https://arxiv.org/abs/2310.03744
Improved Baselines with Visual Instruction Tuning
LLaVA-Plus
https://arxiv.org/abs/2311.05437
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents