LLM-JP-Eval
NLI (Natural Language Inference)
Jamp
Source: https://github.com/tomo-ut/temporalNLI_dataset
JaNLI
Source: https://github.com/verypluming/JaNLI
JNLI (JGLUE)
Source: https://github.com/yahoojapan/JGLUE
JSeM
Source: https://github.com/DaisukeBekki/JSeM
JSICK
Source: https://github.com/verypluming/JSICK
QA (Question Answering)
JEMHopQA
Source: https://github.com/aiishii/JEMHopQA
NIILC
Source: https://github.com/mynlp/niilc-qa
RC (Reading Comprehension)
JSQuAD (JGLUE)
Source: https://github.com/yahoojapan/JGLUE
MC (Multiple Choice question answering)
JCommonsenseQA (JGLUE)
Source: https://github.com/yahoojapan/JGLUE
EL (Entity Linking)
chABSA
Source: https://github.com/chakki-works/chABSA-dataset
FA (Fundamental Analysis)
Wikipedia Annotated Corpus
Source: https://github.com/ku-nlp/WikipediaAnnotatedCorpus
Task list
Reading prediction
Named entity recognition
Dependency parsing
Predicate-argument structure analysis
Coreference resolution
MR (Mathematical Reasoning)
MAWPS
Source: https://github.com/nlp-waseda/chain-of-thought-ja-dataset
STS (Semantic Textual Similarity)
JSTS (JGLUE)
Source: https://github.com/yahoojapan/JGLUE
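For quick lookup, the category-to-dataset grouping listed above can be written down as a small mapping. This is only a minimal sketch transcribed from this list: the names DATASETS_BY_CATEGORY, JGLUE_DATASETS, and category_of are hypothetical and chosen for illustration, and they are not identifiers from llm-jp-eval itself.

```python
# Category abbreviation -> datasets, transcribed from the list above.
# All names here are illustrative assumptions, not llm-jp-eval identifiers.
DATASETS_BY_CATEGORY: dict[str, list[str]] = {
    "NLI": ["Jamp", "JaNLI", "JNLI", "JSeM", "JSICK"],
    "QA": ["JEMHopQA", "NIILC"],
    "RC": ["JSQuAD"],
    "MC": ["JCommonsenseQA"],
    "EL": ["chABSA"],
    "FA": ["Wikipedia Annotated Corpus"],
    "MR": ["MAWPS"],
    "STS": ["JSTS"],
}

# Datasets marked "(JGLUE)" in the list above.
JGLUE_DATASETS: set[str] = {"JNLI", "JSQuAD", "JCommonsenseQA", "JSTS"}


def category_of(dataset: str) -> str:
    """Return the category abbreviation that a dataset is listed under."""
    for category, names in DATASETS_BY_CATEGORY.items():
        if dataset in names:
            return category
    raise KeyError(dataset)
```

For example, category_of("JSQuAD") returns "RC", and a dataset name that is not in the list raises KeyError.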
References
https://speakerdeck.com/olachinkei/llm-jp-eval-ri-ben-yu-da-gui-mo-yan-yu-moteruno-zi-dong-ping-jia-turunokai-fa-nixiang-kete?slide=24
https://www.youtube.com/watch?v=HavcE7GgstY