Detailed Introduction
GraphSearch is a graph-centric Retrieval-Augmented Generation (RAG) workflow that connects graph construction, graph retrieval, and generative inference in a single reproducible pipeline. The project integrates multiple GraphRAG methods (e.g., LightRAG, HyperGraphRAG) and provides datasets, graph-building utilities, and example inference code to support multi-hop retrieval and QA over structured relational data.
Main Features
- Graph construction tools: build knowledge graphs and contextual graph indexes from text with preprocessing and sharding options.
- Multiple GraphRAG methods: built-in or compatible strategies for graph-based retrieval and fusion to facilitate comparison and extension.
- Reproducible pipelines: includes datasets and build, index, and inference scripts to support experimental reproducibility and benchmarking.
- Research resources: paper citation and dataset references to aid academic reproduction and evaluation.
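To make the graph construction and multi-hop retrieval ideas above concrete, here is a minimal sketch in plain Python. It is not GraphSearch's own API: the function names `build_graph` and `multi_hop`, the triple format, and the toy data are all assumptions chosen for illustration, and a real graph would be extracted from text rather than written by hand.

```python
from collections import defaultdict, deque


def build_graph(triples):
    """Build an undirected adjacency map from (head, relation, tail) triples."""
    graph = defaultdict(set)
    for head, relation, tail in triples:
        graph[head].add((relation, tail))
        graph[tail].add((relation, head))
    return graph


def multi_hop(graph, start, max_hops):
    """Breadth-first search returning {entity: hop distance} within max_hops."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand past the hop limit
        for _relation, neighbor in graph.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


# Toy triples (hypothetical input, hand-written for this example).
triples = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Warsaw", "capital_of", "Poland"),
    ("Poland", "member_of", "European Union"),
]
graph = build_graph(triples)
reachable = multi_hop(graph, "Marie Curie", max_hops=2)
```

With a two-hop limit, a question about Marie Curie can reach "Poland" through "Warsaw", which a single-hop lookup would miss. This is the kind of chained evidence that multi-hop QA depends on.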
Use Cases
- Multi-hop QA and knowledge QA: improve retrieval accuracy and generation quality on data with complex entity relations.
- Domain search: targeted retrieval and summarization over legal, medical, or scientific knowledge graphs.
- Method development and baselines: provide pipelines and examples for researchers to implement and compare new approaches.
- Teaching and demos: serve as an educational starting point for graph-based retrieval and RAG concepts.
Technical Features
- Language & dependencies: primarily Python, with scripts and dependency instructions for standard experimental environments.
- Modular architecture: decouples graph construction, retrieval, fusion, and generation components for easy substitution.
- Extensible input formats: supports various text and graph inputs, enabling integration with existing corpora and knowledge bases.
- Open-source licensing: repository uses permissive open-source licenses to support research and commercial adoption.
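The modular architecture described above can be pictured with a small sketch in which retrieval, fusion, and generation are independent, swappable callables. Every name here (`Pipeline`, `dedupe_fuse`, the stub retrievers and generator) is hypothetical and not taken from the GraphSearch codebase; the generator is a string template standing in for an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stage signatures, assumed for illustration only.
Retriever = Callable[[str], List[str]]
Fuser = Callable[[List[List[str]]], List[str]]
Generator = Callable[[str, List[str]], str]


@dataclass
class Pipeline:
    retrievers: List[Retriever]
    fuse: Fuser
    generate: Generator

    def answer(self, question: str) -> str:
        # Each stage only sees the output of the previous one,
        # so any stage can be replaced without touching the others.
        candidates = [retrieve(question) for retrieve in self.retrievers]
        context = self.fuse(candidates)
        return self.generate(question, context)


def dedupe_fuse(result_lists: List[List[str]]) -> List[str]:
    """Merge result lists, keeping first-seen order and dropping duplicates."""
    seen, merged = set(), []
    for results in result_lists:
        for item in results:
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged


# Stub stages standing in for real graph retrievers and a real generator.
def entity_retriever(question: str) -> List[str]:
    return ["Warsaw is the capital of Poland."]


def relation_retriever(question: str) -> List[str]:
    return ["Warsaw is the capital of Poland.", "Marie Curie was born in Warsaw."]


def template_generator(question: str, context: List[str]) -> str:
    return f"Q: {question} | context: {' '.join(context)}"


pipeline = Pipeline(
    retrievers=[entity_retriever, relation_retriever],
    fuse=dedupe_fuse,
    generate=template_generator,
)
result = pipeline.answer("Where was Marie Curie born?")
```

Swapping in a different retrieval method or fusion strategy then means passing a different callable to `Pipeline`, which is one simple way to compare methods under otherwise identical conditions.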