Video Foundation Models
Video Recognition Meta Survey
VideoCoCa
Lavender
InternVideo
Foundational Models Defining a New Era in Vision: A Survey and Outlook
Awesome-LLMs-for-Video-Understanding
SAMURAI
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos
SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree
LongVU