Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models
https://arxiv.org/abs/2304.09842
https://proceedings.neurips.cc/paper_files/paper/2023/hash/871ed095b734818cfba48db6aeb25a62-Abstract-Conference.html
We present Chameleon, an AI system that mitigates these limitations by augmenting LLMs with plug-and-play modules for compositional reasoning.
Chameleon synthesizes programs by composing various tools (e.g., LLMs, off-the-shelf vision models, web search engines, Python functions, and heuristic-based modules) for accomplishing complex reasoning tasks.
At the heart of Chameleon is an LLM-based planner that assembles a sequence of tools to execute to generate the final response.
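The plan-then-execute loop described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the planner prompt, module names, and the shared-cache interface are my assumptions.

```python
# Minimal sketch of Chameleon-style planning and execution. The LLM-based
# planner picks an ordered list of modules; the executor runs them in
# sequence, threading a shared cache that each module reads and updates.

def plan(question, module_inventory, llm):
    """LLM-based planner: ask the LLM for an ordered list of module names."""
    prompt = (
        "Select an ordered subset of modules to answer the question.\n"
        f"Modules: {', '.join(module_inventory)}\n"
        f"Question: {question}\nPlan:"
    )
    return [name.strip() for name in llm(prompt).split(",")]

def execute(question, plan_steps, modules):
    """Run the planned modules in sequence, accumulating context in a cache."""
    cache = {"question": question, "context": ""}
    for name in plan_steps:
        cache = modules[name](cache)  # each module consumes and extends the cache
    return cache["answer"]

# Toy stand-ins for the LLM and the module inventory (assumptions, for
# illustration only).
inventory = ["Text_Detector", "Knowledge_Retrieval", "Solution_Generator"]
fake_llm = lambda prompt: "Knowledge_Retrieval, Solution_Generator"
modules = {
    "Knowledge_Retrieval": lambda c: {**c, "context": c["context"] + "retrieved facts. "},
    "Solution_Generator": lambda c: {**c, "answer": "B"},
}

steps = plan("Which material is a conductor?", inventory, fake_llm)
answer = execute("Which material is a conductor?", steps, modules)
```

The key design point the sketch captures is that modules are interchangeable: the planner can compose any subset of the inventory, so adding a new tool only requires registering another entry in `modules`.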
https://chameleon-llm.github.io/
https://github.com/lupantech/chameleon-llm
Figure 1
Input: an image and a question (3 answer choices)
text detector (reads the text in the image)
knowledge retriever (retrieves knowledge based on keywords)
Chain of thought
https://github.com/lupantech/chameleon-llm/blob/main/assets/showcase_scienceqa.png?raw=true
Datasets used (both from the same author, lupantech):
ScienceQA Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
TabMWP Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning
Table 1 (comparison of Chameleon with other tool-augmented LLMs)
Tool-Augmented Language Models (Section 2, Related Work)
Examples:
WebGPT: Browser-assisted question-answering with human feedback
Generate rather than Retrieve: Large Language Models are Strong Context Generators
Toolformer: Language Models Can Teach Themselves to Use Tools
Module inventory
Performance also improves via prompting