Code Generation
Large Language Model
Foundation Model
CodeMMLU: A Multi-Task Benchmark for Assessing Code Understanding Capabilities of CodeLLMs
https://arxiv.org/abs/2410.01999
Dracarys2
Copilot Arena
https://github.com/lmarena/copilot-arena