雰囲気マークアップ

プロンプトをXMLで構造化すると、LLMが雰囲気を読んで、偶然いい感じにやってくれる事が多い

Structured OutputsやAnthropicのTool useみたいに、LLMとユーザーの間で動作するシステムとは別の概念

Anthropic

https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/use-xml-tags

プロンプト内で、XMLのような雰囲気で構造化マークアップする

code: prompt

〇〇してください

こういうXMLをこう扱え、という学習はかけていない

There are no canonical “best” XML tags that Claude has been trained with in particular, although we recommend that your tag names make sense with the information they surround.

LLMの学習の副産物として、XMLをプレーンテキストの区切りとして与えると、雰囲気を読んでいい感じに扱う挙動が発生している

OpenAIも言っている

https://cdn.openai.com/API/docs/gpt-5-for-coding-cheatsheet.pdf

https://scrapbox.io/files/68a216465ab0809d7c25525a.png

XML-like、つまりXMLそのものではない

Google

プロンプトを構造化する | Generative AI on Vertex AI | Google Cloud

Microsoft

Document Embedding in Prompts | Microsoft Learn

プロンプト内にドキュメントを含める場合は<documents>で囲む

Prompt engineering techniques - Azure OpenAI | Microsoft Learn

markdownかXMLを使え

markdownの区切り線も有用

Amazon

PromptTemplate - Amazon Bedrock

意味的な区切りにXMLを使え

これはAmazon BedrockのドキュメントなのでAntropicのドキュメントを根拠として挙げている

Best practices to avoid prompt injection attacks - AWS Prescriptive Guidance

Anthropicのドキュメントを指したPrompt Injection手法

本文にXMLを書けば偽のプロンプトを本物だとLLMに信じ込ませれる

LangChain

https://python.langchain.com/docs/how_to/output_parser_xml/

Structured Outputs的な文脈において

一部のLLMはJSONよりXMLの方が出力が安定する場合がある

として、XMLで出力形式を指示する例を紹介している