MT-Bench
MT-bench is designed to test multi-turn conversation and instruction-following ability (2.2)
We identify 8 common categories of user prompts to guide its construction: writing, roleplay, extraction, reasoning, math, coding, knowledge I (STEM), and knowledge II (humanities/social science) (2.2)
10 questions for each of the 8 categories (80 questions in total)
Table 1: sample multi-turn questions
There is an extraction category! (homework)
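The layout noted above (8 categories, 10 questions per category, multi-turn questions) can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the benchmark's actual loader: the `Question` class, the `build_placeholder_benchmark` helper, the short category labels, and the choice of two turns per question are assumptions made for the sketch, and the prompts are placeholders rather than real MT-bench questions.

```python
from collections import Counter
from dataclasses import dataclass

# Short labels for the 8 categories listed in the notes (labels are an assumption).
CATEGORIES = [
    "writing", "roleplay", "extraction", "reasoning",
    "math", "coding", "stem", "humanities",
]
QUESTIONS_PER_CATEGORY = 10
TURNS_PER_QUESTION = 2  # assumption: one opening prompt plus one follow-up


@dataclass
class Question:
    """Hypothetical record for one multi-turn benchmark question."""
    question_id: int
    category: str
    turns: list[str]


def build_placeholder_benchmark() -> list[Question]:
    """Build a question set with placeholder prompts, 10 per category."""
    questions = []
    for c_idx, category in enumerate(CATEGORIES):
        for q_idx in range(QUESTIONS_PER_CATEGORY):
            qid = c_idx * QUESTIONS_PER_CATEGORY + q_idx + 1
            turns = [
                f"<{category} prompt {qid}, turn {t + 1}>"
                for t in range(TURNS_PER_QUESTION)
            ]
            questions.append(Question(qid, category, turns))
    return questions


questions = build_placeholder_benchmark()
counts = Counter(q.category for q in questions)
print(len(questions), dict(counts))
```

Keeping the category on each record makes it easy to report per-category results later, for example to look at the extraction questions on their own.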
Japanese MT-Bench