Overview
Vespa is a distributed serving engine for online AI and big-data workloads. It performs low-latency retrieval and inference at scale, supporting vector search, custom ranking, and near-real-time indexing. Typical uses include semantic search, recommendation, and online model serving.
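As a rough sketch of near-real-time indexing, the snippet below feeds one document through Vespa's Document v1 HTTP API. The namespace (`mynamespace`), document type (`doc`), and the `title`/`body`/`embedding` fields are assumptions standing in for whatever your application schema actually defines.

```python
import requests

# Hypothetical endpoint: a Vespa container node listening on port 8080.
# The namespace ("mynamespace"), document type ("doc"), and fields below
# are placeholders for whatever the application schema defines.
doc_id = "1"
url = f"http://localhost:8080/document/v1/mynamespace/doc/docid/{doc_id}"

document = {
    "fields": {
        "title": "Introduction to hybrid retrieval",
        "body": "Vespa combines text and vector search in a single query.",
        # Toy 4-dimensional embedding; real schemas typically use hundreds of dimensions.
        "embedding": [0.12, -0.03, 0.57, 0.44],
    }
}

response = requests.post(url, json=document, timeout=10)
response.raise_for_status()
print(response.json())  # On success, echoes the document id and path
```

Documents fed this way become searchable within a short, bounded delay, which is what the near-real-time indexing above refers to.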
Key features
- High-performance vector and text retrieval combined in hybrid queries (see the query sketch after this list).
- Near-real-time indexing and low-latency query serving.
- Scalable distributed architecture for production workloads.
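As a minimal sketch of a hybrid query, the snippet below posts to Vespa's Search API and combines a lexical `userQuery()` clause with an approximate nearest-neighbor clause over an `embedding` field. The rank profile name `hybrid`, the field names, and the query tensor `q_embedding` are assumptions tied to a hypothetical schema, not Vespa defaults.

```python
import requests

# Hypothetical container endpoint; field names, the "hybrid" rank profile,
# and the query tensor q_embedding are placeholders defined by your schema.
query_embedding = [0.10, -0.02, 0.51, 0.40]

body = {
    # Hybrid retrieval: lexical userQuery() OR an ANN clause over the embedding field.
    "yql": "select * from sources * where userQuery() or "
           "({targetHits:10}nearestNeighbor(embedding, q_embedding))",
    "query": "hybrid retrieval",           # Free-text part matched by userQuery()
    "ranking": "hybrid",                   # Assumed rank profile in the schema
    "input.query(q_embedding)": query_embedding,
    "hits": 10,
}

response = requests.post("http://localhost:8080/search/", json=body, timeout=10)
response.raise_for_status()
for hit in response.json().get("root", {}).get("children", []):
    print(hit.get("relevance"), hit.get("fields", {}).get("title"))
```

The OR combination retrieves candidates by either signal; the rank profile then fuses the lexical and vector scores into a single relevance value per hit.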
Use cases
- Retrieval layer for RAG systems and semantic search.
- Recommendation and personalized online services.
- Low-latency online inference and model serving.
License
- Apache-2.0, permitting commercial use and welcoming open-source contributions.