Training language models to follow instructions with human feedback
The InstructGPT paper.
In this paper, we show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning with human feedback.
Outputs from the 1.3B-parameter InstructGPT model are preferred to outputs from the 175B-parameter GPT-3, even though InstructGPT has 100x fewer parameters.
Reports that high-quality instructions play an important role in SFT (supervised fine-tuning).
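As a rough illustration of the SFT step, instruction-response pairs are formatted into training examples where the loss is computed only on the response tokens. The sketch below is hypothetical (the prompt template and the toy whitespace "tokenizer" are stand-ins, not the paper's actual pipeline):

```python
# Minimal sketch of SFT data preparation. A toy whitespace split stands in
# for a real subword tokenizer; the prompt template is illustrative only.

def build_sft_example(instruction: str, response: str):
    """Concatenate prompt and response; mask prompt tokens so the
    supervised cross-entropy loss covers only the response."""
    prompt = f"Instruction: {instruction}\nResponse:"
    prompt_tokens = prompt.split()
    response_tokens = response.split()
    tokens = prompt_tokens + response_tokens
    # loss_mask[i] == 1 -> token i contributes to the training loss
    loss_mask = [0] * len(prompt_tokens) + [1] * len(response_tokens)
    return tokens, loss_mask

tokens, mask = build_sft_example("Summarize the text.", "A short summary.")
```

Masking the prompt keeps the model from being trained to reproduce the instruction itself, which is the standard practice for this kind of fine-tuning.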