RLHF (Reinforcement Learning from Human Feedback)
Models such as InstructGPT are trained this way
https://gyazo.com/6a0caada495a2bbf38ff20383fab3a0d