HumanEval - nikkie-memos

HumanEval

task_id: identifier for the data sample

prompt: input for the model containing function header and docstrings

"def return1():\n"

canonical_solution: reference solution for the problem in the prompt (the function body, indented so it can be appended to the prompt)

"    return 1"

test: contains function to test generated code for correctness

entry_point: entry point for test

IMO: How is this evaluated? (Is exact string matching against canonical_solution enough?)
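
The test and entry_point fields hint that evaluation runs the generated code rather than comparing strings. Below is a minimal sketch of that idea, not the official harness. The record is modeled on the example above; its task_id, test, and entry_point values are assumed for illustration, and `passes` is a hypothetical helper name.

```python
# Hypothetical record modeled on the fields above.
# "task_id", "test", and "entry_point" values are assumptions for illustration.
sample = {
    "task_id": "test/0",
    "prompt": "def return1():\n",
    "canonical_solution": "    return 1",  # body is indented to follow the prompt
    "test": "def check(candidate):\n    assert candidate() == 1\n",
    "entry_point": "return1",
}


def passes(sample: dict, completion: str) -> bool:
    """Return True if prompt + completion passes the sample's test code."""
    # Build one program: function header, generated body, then the test function.
    program = sample["prompt"] + completion + "\n\n" + sample["test"]
    namespace: dict = {}
    try:
        exec(program, namespace)  # defines the candidate function and check()
        namespace["check"](namespace[sample["entry_point"]])
    except Exception:
        # Syntax errors, runtime errors, and failed assertions all count as failure.
        return False
    return True


print(passes(sample, sample["canonical_solution"]))  # True
print(passes(sample, "    return 2 - 1"))            # True
print(passes(sample, "    return 2"))                # False
```

The second call passes even though its text differs from canonical_solution, which is why exact string matching would be too strict a criterion. Note that `exec` runs arbitrary code in the current process, so anything beyond a toy experiment should run completions in an isolated environment.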