Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine

We introduce Medprompt, based on a composition of several prompting strategies.

With Medprompt, GPT-4 achieves state-of-the-art results on all nine of the benchmark datasets in the MultiMedQA suite.

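The paper argues that OpenAI's GPT-4 can be made usable in the medical domain through prompt engineering alone, without special-purpose tuning.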

Figure 4

How much performance improves when every prompt engineering technique is stacked together.
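
As a rough illustration of what one stacked technique can look like, here is a minimal sketch of choice-shuffle ensembling: ask the same multiple-choice question several times with the answer options in a different order, then take a majority vote. These notes do not list the individual strategies, so treating this as one of them is an assumption taken from the Medprompt paper. The names `choice_shuffle_vote`, `ask_model`, and `toy_model` are hypothetical, and `toy_model` is a stand-in for a real GPT-4 call, not an actual API.

```python
import random
from collections import Counter
from typing import Callable, Sequence


def choice_shuffle_vote(
    question: str,
    choices: Sequence[str],
    ask_model: Callable[[str, Sequence[str]], int],
    n_runs: int = 5,
    seed: int = 0,
) -> str:
    """Query the model n_runs times with shuffled options and majority-vote.

    ask_model is a hypothetical callable that returns the index of the
    option it picks within the (shuffled) list it was shown.
    """
    rng = random.Random(seed)
    votes: Counter = Counter()
    for _ in range(n_runs):
        order = list(choices)
        rng.shuffle(order)                    # new option order for each run
        picked_index = ask_model(question, order)
        votes[order[picked_index]] += 1       # vote on the option text, not its position
    return votes.most_common(1)[0][0]


def toy_model(question: str, options: Sequence[str]) -> int:
    """Stand-in for a real model call (assumption): picks the longest option."""
    return max(range(len(options)), key=lambda i: len(options[i]))


if __name__ == "__main__":
    answer = choice_shuffle_vote(
        "Which drug is a first-line treatment for type 2 diabetes?",
        ["aspirin", "metformin", "insulin"],
        toy_model,
    )
    print(answer)
```

Voting on the option text rather than its position is the point of the design: a model that is biased toward a particular answer slot gets that bias averaged out across the shuffled runs.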