Adaptive LLM Routing under Budget Constraints
LLM routing addresses this by dynamically selecting the most suitable LLM for each query/task.
We thus propose to study LLM routing as a contextual bandit problem, enabling adaptive decision-making using bandit feedback without requiring exhaustive inference across all LLMs for all queries (in contrast to supervised routing).
shared embedding space for queries and LLMs, where query and LLM embeddings are aligned to reflect their affinity
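As a rough illustration of what "aligned to reflect their affinity" could mean in practice, the sketch below scores each LLM by the cosine similarity between its embedding and the query embedding, then picks the highest-scoring one. Cosine similarity is an assumption made here for illustration, and the model names and 3-dimensional vectors are hypothetical; the excerpt does not specify the affinity measure.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def route_by_affinity(query_emb, llm_embs):
    """Return the name of the LLM whose embedding is most aligned with the query."""
    return max(llm_embs, key=lambda name: cosine(query_emb, llm_embs[name]))

# Hypothetical embeddings in a shared 3-dimensional space (illustration only).
llm_embs = {
    "model_a": [1.0, 0.0, 0.0],
    "model_b": [0.0, 1.0, 0.0],
}
query_emb = [0.2, 0.9, 0.1]

chosen = route_by_affinity(query_emb, llm_embs)
print(chosen)
```

Because the query vector points mostly along the second axis, the query is routed to the hypothetical `model_b`.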
Figure 1
This query goes to this model
Preference-prior Informed Linucb fOr adaptive rouTing (PILOT), a novel extension of LinUCB
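To make the LinUCB connection concrete, here is a minimal sketch of a disjoint LinUCB arm with an optional prior on its weight vector. It is not taken from the paper: the class name `LinUCBArm`, the parameters `lam` and `alpha`, and the choice to inject the prior by setting b = lam * prior (so the initial estimate equals the prior) are all assumptions for illustration. The inverse of A is maintained with the Sherman-Morrison formula so that only the standard library is needed.

```python
import math

class LinUCBArm:
    """One arm (one candidate LLM) of a disjoint LinUCB bandit. Hypothetical sketch."""

    def __init__(self, dim, lam=1.0, prior=None):
        # A = lam * I, stored directly as its inverse.
        self.A_inv = [[(1.0 / lam if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        # Assumption: b = lam * prior, so theta = A^{-1} b starts at the prior.
        prior = prior if prior is not None else [0.0] * dim
        self.b = [lam * p for p in prior]

    def _A_inv_times(self, x):
        return [sum(row[j] * x[j] for j in range(len(x))) for row in self.A_inv]

    def theta(self):
        """Current ridge-regression estimate of the arm's weight vector."""
        return self._A_inv_times(self.b)

    def ucb(self, x, alpha=1.0):
        """Estimated reward plus an exploration bonus for context x."""
        Ax = self._A_inv_times(x)
        mean = sum(t * xi for t, xi in zip(self.theta(), x))
        width = math.sqrt(sum(a * xi for a, xi in zip(Ax, x)))
        return mean + alpha * width

    def update(self, x, reward):
        """Incorporate one observed (context, reward) pair for this arm only."""
        # Sherman-Morrison rank-one update of A^{-1} after A += x x^T.
        Ax = self._A_inv_times(x)
        denom = 1.0 + sum(a * xi for a, xi in zip(Ax, x))
        for i in range(len(x)):
            for j in range(len(x)):
                self.A_inv[i][j] -= Ax[i] * Ax[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]

# Two hypothetical LLMs with hypothetical prior vectors.
arms = {
    "model_a": LinUCBArm(2, prior=[1.0, 0.0]),
    "model_b": LinUCBArm(2, prior=[0.0, 1.0]),
}
x = [0.0, 1.0]  # query context (illustrative)
choice = max(arms, key=lambda name: arms[name].ucb(x))
arms[choice].update(x, reward=1.0)  # bandit feedback: only the chosen arm is updated
```

Only the selected arm receives feedback, which mirrors the bandit setting described above: no other LLM has to be run on the query to perform the update.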
Figure 2
Evaluation datasets (3.1)
Used for training the embedding space (3.1)