Adaptive LLM Routing under Budget Constraints
https://arxiv.org/abs/2508.21141
LLM routing addresses this by dynamically selecting the most suitable LLM for each query/task.
We thus propose to study LLM routing as a contextual bandit problem, enabling adaptive decision-making using bandit feedback without requiring exhaustive inference across all LLMs for all queries (in contrast to supervised routing).
shared embedding space for queries and LLMs, where query and LLM embeddings are aligned to reflect their affinity
Figure 1
For this query, use this model
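As a rough illustration of the shared embedding space idea, the sketch below routes a query to the LLM whose embedding is closest to it. Cosine similarity as the affinity measure, the toy 2-d vectors, and the names `route_by_affinity`, `model_a`, `model_b`, `model_c` are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def route_by_affinity(query_emb, llm_embs, llm_names):
    """Pick the LLM whose embedding has the highest cosine similarity to the query.

    Assumption: affinity is measured by cosine similarity in the shared space.
    """
    q = query_emb / np.linalg.norm(query_emb)
    m = llm_embs / np.linalg.norm(llm_embs, axis=1, keepdims=True)
    scores = m @ q                      # one affinity score per LLM
    best = int(np.argmax(scores))
    return llm_names[best], scores

# Toy embeddings (hypothetical values, 2-d for readability).
llm_names = ["model_a", "model_b", "model_c"]
llm_embs = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.7, 0.7],
])
query_emb = np.array([0.1, 0.9])

chosen, scores = route_by_affinity(query_emb, llm_embs, llm_names)
print(chosen, scores)
```

In this toy setup the query vector points mostly along the second axis, so the LLM whose embedding lies on that axis gets the highest score.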
Preference-prior Informed Linucb fOr adaptive rouTing (PILOT), a novel extension of LinUCB
Figure 2
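For reference, a minimal sketch of plain (disjoint) LinUCB used as a router under bandit feedback: each round only the reward of the chosen LLM is observed, so no query has to be run through every LLM. This is the standard baseline that PILOT extends, not PILOT itself; the preference-prior part is not shown. The class name `LinUCBRouter`, the value of `alpha`, and the toy reward function are assumptions for illustration.

```python
import numpy as np

class LinUCBRouter:
    """Disjoint LinUCB: one ridge-regression model per arm (LLM)."""

    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha                                   # exploration strength
        self.A = [np.eye(dim) for _ in range(n_arms)]        # per-arm design matrices
        self.b = [np.zeros(dim) for _ in range(n_arms)]      # per-arm reward-weighted sums

    def select(self, x):
        """Return the arm with the highest upper confidence bound for context x."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                                # ridge estimate of arm weights
            bonus = self.alpha * np.sqrt(x @ A_inv @ x)      # uncertainty bonus
            scores.append(theta @ x + bonus)
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        """Bandit feedback: only the chosen arm's statistics are updated."""
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x

# Toy environment (hypothetical): two query types, two LLMs.
# LLM 0 answers type-0 queries well, LLM 1 answers type-1 queries well.
contexts = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]

def toy_reward(arm, query_type):
    return 1.0 if arm == query_type else 0.0

router = LinUCBRouter(n_arms=2, dim=2, alpha=1.0)
for t in range(40):
    query_type = t % 2
    x = contexts[query_type]
    arm = router.select(x)
    router.update(arm, x, toy_reward(arm, query_type))

print(router.select(contexts[0]), router.select(contexts[1]))
```

After a few dozen rounds the router sends each query type to the LLM that earns reward on it, having observed only the outcomes of its own choices.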
Evaluation dataset (3.1)
RouterBench: A Benchmark for Multi-LLM Routing System
Used for training the embedding space (3.1)
Chatbot Arena: Benchmarking LLMs in the Wild with Elo Ratings