GraphRAG Related Surveys
2025-09-04
GPT5.icon
1) What to extract (type of extraction target)
A. The basic trio
Entity / Relation / Event: named entities within and across documents, relation links between them, and events with their participant roles.
B. Time, causality, and time series
Extraction of temporal relationships (before/after/simultaneous) and event causality.
Used to construct timelines and causal graphs across documents.
C. Arguments, claims, and grounds (provenance)
Useful for accountability, verification, and reproducibility of answers.
D. Coreference and normalization
Entity identity resolution (aliases/abbreviations/spelling variants) and entity linking (e.g., Wikidata).
Directly affects the consistency and retrieval reproducibility of the knowledge graph (essential preprocessing for any implementation); a minimal sketch follows.
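As a toy illustration of D, here is a minimal normalization-plus-linking sketch. The alias table and the Wikidata mapping below are illustrative assumptions, not a real dataset:
code:normalize.py
 # Sketch of alias normalization + KB linking; the alias table and the
 # Wikidata ID mapping are illustrative assumptions, not real data.
 ALIASES = {"MSFT": "Microsoft", "Microsoft Corp.": "Microsoft"}
 KB_IDS = {"Microsoft": "wikidata:Q2283"}  # canonical name -> KB identifier
 
 def normalize(mention: str) -> str:
     """Collapse aliases/abbreviations/spelling variants to one canonical form."""
     return ALIASES.get(mention.strip(), mention.strip())
 
 def link(mention: str) -> str | None:
     """Attach a KB identifier (e.g. Wikidata) to a normalized mention."""
     return KB_IDS.get(normalize(mention))
 
 print(link("MSFT"))  # -> wikidata:Q2283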
2) How to use (processing patterns using graph/DAG)
(1) Search/RAG (graph-guided)
GraphRAG (Microsoft): answers questions with local search (the surrounding subgraph + source text chunks), global search (across communities and their reports), or DRIFT (a combination of both). Strong on "big picture" questions. (Microsoft GitHub, [Microsoft https://www.microsoft.com/en-us/research/blog/graphrag-new-tool-for-complex-data-discovery-now-on-github/?utm_source=chatgpt.com])
HippoRAG: expands to related nodes with Personalized PageRank over the KG, aiming for multi-step-reasoning accuracy from single-step retrieval (up to ~20% improvement reported on HotpotQA and others). Contributes to better multi-hop QA and retrieval efficiency. (arXiv)
LightRAG / LlamaIndex-KG: lightweight implementations that go triplet extraction → subgraph RAG. Suitable for integrating proprietary graphs and for adding a graph to an existing stack at low cost.
G-Retriever / GRAG: specialized in graph QA; combine GNN + LLM + RAG for fast, accurate QA over giant graphs.
Survey: an overview of GraphRAG's standard workflow (G-Indexing -> G-Retrieval -> G-Generation). Recommended reading before implementation. (arXiv)
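A minimal sketch of the HippoRAG-style idea: rank KG nodes by Personalized PageRank mass concentrated around seed entities found in the question. The graph is a toy example; real systems would seed from NER over the query. Requires networkx >= 2.4:
code:ppr_retrieval.py
 # Rank KG nodes by Personalized PageRank around the question's seed entities.
 import networkx as nx
 
 G = nx.Graph()
 G.add_edges_from([
     ("Alan Turing", "Bletchley Park"),
     ("Bletchley Park", "Enigma"),
     ("Alan Turing", "Turing machine"),
     ("Enigma", "World War II"),
 ])
 
 def expand(seeds, top_k=3):
     """Return the top_k nodes most relevant to the seed entities."""
     scores = nx.pagerank(G, alpha=0.85, personalization={s: 1.0 for s in seeds})
     return sorted(scores, key=scores.get, reverse=True)[:top_k]
 
 print(expand(["Alan Turing"]))  # e.g. ['Alan Turing', 'Bletchley Park', ...]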
(2) Summarization and overall understanding
GraphRAG (QFS-oriented): scales query-focused summarization by combining the KG with community summaries. For cross-document reports and extracting themes from huge corpora. (arXiv)
Hierarchical summary indexes in the tree/DAG family (e.g., RAPTOR). (arXiv)
(3) Agent/dialogue memory
Theanine (memory system): does not assume memory deletion. Arranges past events on a timeline with time/causality links and injects the **"transition of events/causality"** as context into response generation. Also presents the evaluation framework TeaFarm (NAACL 2025). Specialized in preserving the context of long-term dialogues. (arXiv)
MemGPT / LongMem: handle long histories via memory hierarchies and virtual context management (memory OS/banks rather than graphs); complementary to Theanine's time/causality graphs. (arXiv, [NeurIPS Proceedings https://proceedings.neurips.cc/paper_files/paper/2023/file/ebd82705f44793b6f9ade5a669d0f0bf-Paper-Conference.pdf?utm_source=chatgpt.com])
(4) Verification, explanation, and consistency checks
PROV keeps provenance edges (who/what/when something was generated) so an evidence path can be presented with each answer; AIF keeps the argument graph of claim - evidence - counterargument, so the system can explain "why this is so". (W3C, [Wikipedia https://en.wikipedia.org/wiki/Argument_Interchange_Format?utm_source=chatgpt.com])
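A minimal sketch of recording PROV-O provenance edges with rdflib (the PROV namespace ships with rdflib; the URIs here are made-up examples):
code:prov_edges.py
 # Attach provenance edges to an extracted claim, then walk them at answer time.
 from rdflib import Graph, Namespace
 from rdflib.namespace import PROV, RDF
 
 EX = Namespace("http://example.org/")
 g = Graph()
 
 claim = EX["claim/42"]                      # an extracted claim node
 source = EX["doc/annual-report"]            # the source chunk it came from
 activity = EX["activity/llm-extraction-run-7"]
 
 g.add((claim, RDF.type, PROV.Entity))
 g.add((claim, PROV.wasDerivedFrom, source))     # what it came from
 g.add((claim, PROV.wasGeneratedBy, activity))   # which run produced it
 g.add((activity, RDF.type, PROV.Activity))
 
 # Present the evidence path when responding.
 for s, p, o in g.triples((claim, PROV.wasDerivedFrom, None)):
     print(f"{s} was derived from {o}")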
3) Comparison of representative approaches (key points)
table:_
 Family	Main extraction	Structure used	Strengths / uses (key points)
 GraphRAG (Microsoft)	KG extraction with LLM + community detection -> hierarchical summaries	KG + hierarchy (global/local/DRIFT)	Strong at whole-corpus and multi-document QFS; full operational guidance and implementation. (Microsoft GitHub, Microsoft)
 HippoRAG	Entity/relation extraction	KG + Personalized PageRank	Low-cost, high-accuracy multi-hop QA; strong even with single-step retrieval. (arXiv)
 LlamaIndex-KG	Triplet extraction	Subgraph RAG	Easy to add a KG retriever to an existing RAG. (LlamaIndex)
 Theanine	Dialogue memory fragments	Timeline linked by time and causality	Long-term dialogue that understands the "history of change" and reflects it in responses. (arXiv)
 RAPTOR	Embedding -> clustering	Hierarchical summary tree (Tree/DAG)	Searches a huge corpus step by step from top-level concepts. (arXiv)
4) Implementation intuition (design patterns)
(a) Data model design
Property graph or RDF: ease of querying (Cypher) vs. standard vocabularies and interoperability (RDF/OWL, PROV-O)? Attaching sources and the generation process with PROV from the start is easier than retrofitting it. (W3C)
Node type design: Entity / Event / Claim / Source / Community / Memory, etc. Time (occurrence / observation / validity period) and causality are first-class citizens.
IDs and identity resolution: coreference integration (merge differing descriptions of the same person/organization). Paying this knowledge-maintenance cost in the initial phase stabilizes multi-hop search later; a schema sketch follows this list.
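A minimal schema sketch in this spirit, with time, causality and provenance as first-class fields. The names are illustrative, not a standard (Python 3.10+ for the `| None` annotations):
code:data_model.py
 # Property-graph style node/edge model with first-class time and provenance.
 from dataclasses import dataclass, field
 from datetime import datetime
 
 @dataclass
 class Node:
     id: str                              # canonical ID after coreference resolution
     type: str                            # Entity / Event / Claim / Source / Community / Memory
     label: str
     valid_from: datetime | None = None   # validity period, not just creation time
     valid_to: datetime | None = None
     aliases: list[str] = field(default_factory=list)
 
 @dataclass
 class Edge:
     src: str
     dst: str
     type: str                            # e.g. BEFORE / CAUSES / MENTIONS / DERIVED_FROM
     source_id: str | None = None         # provenance: which Source node supports this edge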
(b) Extraction pipeline (minimum configuration)
1. segmentation (paragraph/sentence/utterance) → linguistic preprocessing (NER, coreference)
2. entity/relation/event extraction (triples plus event roles)
3. time/causality links: extract event time/order/causality (rules + LLM, or existing methods) (arXiv)
4. normalization/identification (KB linking, dictionaries, embeddings)
5. storage (Graph DB / RDF store) + dual index (vector & graph)
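A runnable skeleton wiring these five steps together. The extractors are trivial placeholders; in practice each would be an LLM call or a rule-based component:
code:pipeline.py
 # Minimal end-to-end skeleton of the five-step pipeline above (sketch).
 def extract_triples(chunk: str) -> list[tuple]:
     return []  # placeholder: NER + relation/event extraction goes here
 
 def extract_temporal_links(triples: list) -> list[tuple]:
     return []  # placeholder: before/after/simultaneous + causality links
 
 def resolve_entity(name: str) -> str:
     return name.strip().lower()  # placeholder: alias/KB-link resolution
 
 def run_pipeline(document: str) -> dict:
     chunks = document.split("\n\n")                            # 1. segmentation (naive)
     triples = [t for c in chunks for t in extract_triples(c)]  # 2. extraction
     links = extract_temporal_links(triples)                    # 3. time/causality
     triples = [(resolve_entity(s), r, resolve_entity(o))
                for s, r, o in triples]                         # 4. normalization
     return {"chunks": chunks, "triples": triples, "links": links}  # 5. feed graph DB + vector index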
(c) Retrieval
Local: take seed nodes from the question and expand their neighborhood, returning the subgraph together with its source text chunks (GraphRAG: Local). (Microsoft GitHub)
Ranking: a hybrid of PPR / shortest paths / centrality with embedding similarity; PPR is the key to HippoRAG (a good starting point for implementation). (arXiv)
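A sketch of the "local" pattern with networkx: expand the seed's neighborhood and gather the source chunks attached to the retrieved nodes. The `chunks` node attribute is an assumption of this sketch, not a library API:
code:local_search.py
 # GraphRAG-style local retrieval: neighborhood expansion + attached chunks.
 import networkx as nx
 
 G = nx.Graph()
 G.add_node("Acme Corp", chunks=["Acme reported record revenue in Q3..."])
 G.add_node("Jane Doe", chunks=["Jane Doe was appointed CEO of Acme..."])
 G.add_edge("Acme Corp", "Jane Doe")
 
 def local_search(seed: str, radius: int = 1):
     sub = nx.ego_graph(G, seed, radius=radius)   # neighborhood expansion
     chunks = [c for n in sub.nodes for c in sub.nodes[n].get("chunks", [])]
     return sub, chunks                           # subgraph + source text together
 
 subgraph, evidence = local_search("Acme Corp")
 print(evidence)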
Pass "structure" to the prompt: subgraph (nodes/edges/provenance) + minimal source text.
Hierarchical summary synthesis: GraphRAG's stepwise synthesis of community reports -> partial answers -> final summary rarely breaks down even at large scale. (arXiv)
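One way to pass "structure" to the prompt: serialize the retrieved edges with their provenance plus minimal source text. The layout below is an assumption for illustration, not a standard format:
code:graph_prompt.py
 # Serialize a retrieved subgraph (edges + provenance) and source excerpts.
 def build_prompt(question, edges, chunks):
     lines = ["# Knowledge subgraph"]
     lines += [f"({s}) -[{r}]-> ({o})  [source: {src}]" for s, r, o, src in edges]
     lines += ["# Source excerpts"] + [f"> {c}" for c in chunks]
     lines += ["# Question", question,
               "Answer using only the evidence above and cite sources."]
     return "\n".join(lines)
 
 print(build_prompt(
     "Who runs Acme?",
     [("Jane Doe", "CEO_OF", "Acme Corp", "doc/press-release")],
     ["Jane Doe was appointed CEO of Acme..."],
 ))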
(e) Memory (dialogue/agent): see (3) under section 2 above (Theanine / MemGPT); per-user event nodes linked by time and causality.
(f) Evaluation
QA/summarization: accuracy (EM/F1) and comprehensiveness/diversity (e.g., the QFS setup in the GraphRAG paper). (arXiv)
Memory-integration understanding: check whether "past event transitions" are actually usable, via counterfactual tests like Theanine's TeaFarm. (arXiv)
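For reference, the two standard QA metrics mentioned above, exact match and token-level F1, in a minimal sketch:
code:qa_metrics.py
 # Exact match and token-level F1, as commonly used for QA evaluation.
 def exact_match(pred: str, gold: str) -> bool:
     return pred.strip().lower() == gold.strip().lower()
 
 def token_f1(pred: str, gold: str) -> float:
     p, g = pred.lower().split(), gold.lower().split()
     common = sum(min(p.count(t), g.count(t)) for t in set(p) & set(g))
     if common == 0:
         return 0.0
     precision, recall = common / len(p), common / len(g)
     return 2 * precision * recall / (precision + recall)
 
 print(token_f1("the cat sat", "a cat sat"))  # -> 0.666...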
(g) Risks and countermeasures
Extraction hallucinations: LLM extraction produces spurious links and over-abstraction. Mitigate with two-step extraction (candidate -> validation) or by requiring a source for every triple, as sketched below.
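A sketch of the candidate -> validation mitigation: a second pass checks each candidate triple against its source chunk and drops unsupported ones. `llm_extract` and `llm_verify` are hypothetical wrappers around your LLM client, not a real API:
code:two_step_extraction.py
 # Two-step extraction: propose candidates, then keep only source-backed ones.
 # llm_extract / llm_verify are hypothetical LLM-client wrappers.
 def validated_triples(chunk: str) -> list:
     candidates = llm_extract(chunk)   # pass 1: propose (subject, relation, object)
     kept = []
     for triple in candidates:
         # pass 2: "Is this triple explicitly supported by the text?" (yes/no)
         if llm_verify(chunk, triple):
             kept.append(triple)       # keep only triples the source supports
     return kept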
5) First implementation templates (minimum configuration)
1. baseline: vector RAG (embeddings + preserved source text).
2. KG combination: generate and store triples with LlamaIndex's KnowledgeGraphIndex (Neo4j or an RDF store both work), and connect sources with PROV-O at the same time (see the first sketch after this list). (LlamaIndex, [W3C https://www.w3.org/TR/prov-overview/?utm_source=chatgpt.com])
3. graph utilization: if you face many multi-hop questions, add HippoRAG-style PPR. (arXiv)
4. dialogue memory: link each user's event nodes (utterances/actions/preferences) by time and causality to maintain a timeline (Theanine's policy; see the second sketch after this list). (arXiv)
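For step 2, roughly how the KnowledgeGraphIndex route looks. API names follow llama-index 0.10.x docs (the `llama-index-graph-stores-neo4j` package is needed for Neo4j); the data directory, credentials and query are placeholders, and since the project is migrating toward PropertyGraphIndex, check the current docs:
code:kg_index.py
 # Triplet extraction + storage via LlamaIndex KnowledgeGraphIndex (sketch).
 from llama_index.core import KnowledgeGraphIndex, SimpleDirectoryReader, StorageContext
 from llama_index.graph_stores.neo4j import Neo4jGraphStore
 
 graph_store = Neo4jGraphStore(
     username="neo4j", password="password", url="bolt://localhost:7687"
 )
 storage_context = StorageContext.from_defaults(graph_store=graph_store)
 
 documents = SimpleDirectoryReader("./data").load_data()
 index = KnowledgeGraphIndex.from_documents(
     documents, storage_context=storage_context, max_triplets_per_chunk=2
 )
 query_engine = index.as_query_engine(include_text=True)  # subgraph + source text
 print(query_engine.query("How are X and Y related?"))
And for step 4, a minimal sketch of a per-user timeline memory in the spirit of Theanine; the schema is an assumption inspired by the paper, not its actual code:
code:dialogue_timeline.py
 # Per-user event nodes linked by time (NEXT) and causality (CAUSES).
 import networkx as nx
 from datetime import datetime
 
 memory = nx.DiGraph()
 
 def remember(user: str, event: str, causes: str | None = None):
     node = f"{user}:{event}"
     memory.add_node(node, time=datetime.now(), user=user)
     prev = [n for n in memory.nodes if n != node and memory.nodes[n]["user"] == user]
     if prev:  # chain onto the latest event to keep the timeline
         memory.add_edge(max(prev, key=lambda n: memory.nodes[n]["time"]), node, type="NEXT")
     if causes:
         memory.add_edge(f"{user}:{causes}", node, type="CAUSES")
 
 remember("u1", "adopted a cat")
 remember("u1", "asked about cat food", causes="adopted a cat")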
**ask for "big picture/theme "**: GraphRAG (global/DRIFT) or RAPTOR. (arXiv) 7) Supplement: Graph-of-Thoughts system (graphing of "thoughts")
7) Supplement: the Graph-of-Thoughts family (graphing "thoughts")
Here the extraction target is not external text but the LLM's own intermediate thoughts: a family that strengthens inference with graph structure (reuse, branching, merging). It can be combined with an external KG as a framework for inference. (arXiv)
Reference and starting-point links (excerpts)
GraphRAG (how it works, implementation, and operations blog) (Microsoft GitHub, [Microsoft https://www.microsoft.com/en-us/research/blog/graphrag-new-tool-for-complex-data-discovery-now-on-github/?utm_source=chatgpt.com])
HippoRAG (NeurIPS 2024) (arXiv)
Theanine (NAACL 2025) (arXiv)
GraphRAG Survey (2024) (arXiv)
Next steps (memo for practical use)
In a PoC, expand in this order: 1) triplet extraction + subgraph RAG (inexpensive) → 2) community summaries/DRIFT (supports "big picture" questions) → 3) PPR (multi-hop reinforcement) → 4) dialogue timeline (long-term memory).
Metrics: mix QA (EM/F1), comprehensiveness/diversity (QFS-style), and counterfactuals (TeaFarm). (arXiv)
Policy: attach provenance (source paths) to all output; effective for future auditing and observability. (W3C)
If needed, the data and requirements at hand (corpus size, question patterns, latency budget) can be fleshed out into a minimum-configuration blueprint.
---
This page is auto-translated from /nishio/GraphRAG関連サーベイ using DeepL. If you see something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to spread my thoughts to non-Japanese readers.