A Survey of Large Language Models
Updated 13 times in 2023
https://raw.githubusercontent.com/RUCAIBox/LLMSurvey/main/assets/LLMs-0623-final.png
Shading marks the open models
(There are various other figures as well)
Table 1
Derivative models of LLaMA
https://github.com/RUCAIBox/LLMSurvey/raw/main/assets/llama-0628-final.png
Fig 6: Ratios of various data sources in the pre-training data for existing LLMs
The mix of web pages, conversations, papers, and code varies across models
Fig 4
text-davinci-003 = RLHF applied to text-davinci-002
Agree with the organization in Fig 2
5.1 Instruction Tuning
5.1.1 Formatted Instance Construction
Fig. 11
Formatting NLP Task Datasets
Formatting Daily Chat Data
InstructGPT
Human labelers
Formatting Synthetic Data
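As a rough sketch of the "Formatting NLP Task Datasets" route in Fig. 11: an existing labeled example is wrapped with a natural-language task description to form an instruction instance. A minimal Python illustration follows; the field names and template wording are assumptions, not the survey's.

# Minimal sketch: turning an NLP task example into an instruction-tuning instance.
# The field names ("question", "answer") and the template text are assumed for illustration.

def format_qa_instance(example: dict) -> dict:
    """Wrap a QA example with a natural-language task description (the instruction)."""
    return {
        "instruction": "Answer the following question.",
        "input": example["question"],
        "output": example["answer"],
    }

if __name__ == "__main__":
    raw = {"question": "What is the capital of France?", "answer": "Paris"}
    print(format_qa_instance(raw))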
7 Capacity and Evaluation
Table 14
Basic
Evaluation of Language Generation ability
Language Modeling
Conditional Text Generation
Code Synthesis
HumanEval (see the pass@k sketch after this outline)
Evaluation of Knowledge Utilization ability
Closed-Book QA
Open-Book QA
Knowledge Completion
Evaluation of Complex Reasoning ability
Knowledge Reasoning
Symbolic Reasoning
Mathematical Reasoning
GSM8K
Advanced
Human Alignment
Honesty
Helpfulness
Harmlessness
Interaction with External Environment
Household
Website Environment
Open World
Tool Manipulation
Search Engine
Code Executor
Calculator
Model Interface
Data Interface
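HumanEval (under Code Synthesis above) is usually scored with pass@k. A minimal Python version of the standard unbiased estimator, 1 - C(n-c, k)/C(n, k) per problem with n sampled programs and c passing ones (averaged over problems for the reported score), is sketched below.

import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples generated, c of them pass the tests."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Example: 20 samples per problem, 5 pass the unit tests
print(pass_at_k(20, 5, 1))   # 0.25
print(pass_at_k(20, 5, 10))  # ~0.984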
Table 16 has the comparison table across benchmarks
Table 15 (7.3.2)
Benchmark
Big-Bench
HELM
Human
Chatbot Arena
Model
AlpacaEval
MT-Bench
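AlpacaEval and MT-Bench are model-based evaluations that use a strong LLM as the judge. A minimal sketch of a pairwise judge call is below; the prompt wording and the dummy judge are assumptions for illustration, not either project's actual implementation.

from typing import Callable

# Hypothetical judge prompt; not the actual AlpacaEval / MT-Bench prompt.
JUDGE_TEMPLATE = (
    "You are an impartial judge. Given a question and two answers, "
    "reply with 'A' or 'B' to indicate the better answer.\n\n"
    "Question: {question}\n\nAnswer A: {answer_a}\n\nAnswer B: {answer_b}\n"
)

def judge_pair(question: str, answer_a: str, answer_b: str,
               judge: Callable[[str], str]) -> str:
    """Ask the judge model which of two candidate answers is better."""
    prompt = JUDGE_TEMPLATE.format(question=question, answer_a=answer_a, answer_b=answer_b)
    verdict = judge(prompt).strip().upper()
    return "A" if verdict.startswith("A") else "B"

# Dummy judge for a runnable example; a real run would call an LLM API here.
print(judge_pair("What is 2+2?", "4", "5", judge=lambda prompt: "A"))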