LLaVA
https://gyazo.com/ce808632e38212b2b01050c02afda740
haotian-liu/LLaVA: https://github.com/haotian-liu/LLaVA
A large language and vision assistant built with the aim of multimodal GPT-4-level capabilities
Project: https://llava-vl.github.io
Demo: https://llava.hliu.cc
Model Zoo: https://github.com/haotian-liu/LLaVA/blob/main/docs/MODEL_ZOO.md
Connects the CLIP ViT-L/14 vision encoder to the Vicuna language model
Llama 2 support
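What "connecting" the two models could look like, as a minimal sketch: it assumes a single linear projection maps the vision encoder's patch features into the language model's embedding space, and that the projected visual tokens are simply placed ahead of the text embeddings. The function names (`linear_project`, `build_input_sequence`), the tiny dimensions, and the random weights are illustrative placeholders, not code or values from the LLaVA repository.

```python
import random


def linear_project(features, weight, bias):
    """Apply y = W x + b to each feature vector.

    features: list of vectors, each of length d_vision
    weight:   d_model rows, each of length d_vision (hypothetical projector)
    bias:     vector of length d_model
    """
    return [
        [sum(f * w for f, w in zip(feat, row)) + b for row, b in zip(weight, bias)]
        for feat in features
    ]


def build_input_sequence(visual_tokens, text_embeddings):
    """Place the projected visual tokens ahead of the text embeddings (assumed order)."""
    return visual_tokens + text_embeddings


# Toy sizes for illustration only; real encoders use far larger dimensions.
rng = random.Random(0)
d_vision, d_model = 4, 6
num_patches, num_text_tokens = 3, 2

# Stand-ins for vision-encoder patch features and language-model text embeddings.
patch_features = [[rng.uniform(-1, 1) for _ in range(d_vision)] for _ in range(num_patches)]
text_embeddings = [[rng.uniform(-1, 1) for _ in range(d_model)] for _ in range(num_text_tokens)]

# Hypothetical projector parameters (randomly initialised here, learned in practice).
weight = [[rng.uniform(-1, 1) for _ in range(d_vision)] for _ in range(d_model)]
bias = [0.0] * d_model

visual_tokens = linear_project(patch_features, weight, bias)
sequence = build_input_sequence(visual_tokens, text_embeddings)

print(len(sequence), "tokens, each of width", len(sequence[0]))
```

In this sketch only the projector carries new parameters; the vision encoder and language model are treated as given, which is why the connection can be expressed as one small mapping between two embedding widths.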
https://huggingface.co/liuhaotian/llava-llama-2-13b-chat-lightning-preview
LLaVA-1.5
Improved Baselines with Visual Instruction Tuning: https://arxiv.org/abs/2310.03744
LLaVA-Plus
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents: https://arxiv.org/abs/2311.05437