Giskard OSS

An open-source evaluation and testing framework to detect performance, bias, and security issues in AI systems.

Giskard-AI · Since 2022-03-06

Introduction

Giskard is an open-source evaluation and testing framework that helps developers automatically detect performance, bias and security issues in LLM-based and traditional ML models. Its tooling ranges from RAG evaluation to tests for vision models.

Key Features

Automated Scan: detect hallucinations, prompt injections, sensitive data leaks and robustness issues.
RAGET: automatically generate evaluation datasets for RAG applications and evaluate generator/retriever components.
Multi-model and environment support: works with any model via simple wrappers and runs locally, in Colab or in CI.
Visualization & interaction: provides a web UI, documentation and examples to inspect and share results.
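
To make the "simple wrappers" and robustness ideas above more concrete, here is a minimal sketch in plain Python. It does not use Giskard's actual API: WrappedModel, robustness_failures and toy_sentiment are hypothetical names introduced only for illustration. The assumption is that a model can be reduced to a function from a list of texts to a list of labels, and that a robustness issue is any input whose prediction changes under a harmless perturbation.

```python
from typing import Callable, List


class WrappedModel:
    """Hypothetical wrapper (not Giskard's class): texts in, one label per text out."""

    def __init__(self, name: str, predict_fn: Callable[[List[str]], List[str]]):
        self.name = name
        self._predict_fn = predict_fn

    def predict(self, inputs: List[str]) -> List[str]:
        outputs = self._predict_fn(inputs)
        if len(outputs) != len(inputs):
            raise ValueError("predict_fn must return one output per input")
        return outputs


def robustness_failures(
    model: WrappedModel,
    inputs: List[str],
    perturb: Callable[[str], str],
) -> List[str]:
    """Return the inputs whose prediction changes after the perturbation."""
    original = model.predict(inputs)
    perturbed = model.predict([perturb(text) for text in inputs])
    return [text for text, a, b in zip(inputs, original, perturbed) if a != b]


def toy_sentiment(texts: List[str]) -> List[str]:
    # Deliberately case-sensitive so the check has something to find.
    return ["positive" if "good" in text else "negative" for text in texts]


model = WrappedModel("toy-sentiment", toy_sentiment)
failures = robustness_failures(model, ["good service", "bad service"], str.upper)
print(failures)
```

Because the wrapper only depends on a callable, the same check could be pointed at a scikit-learn pipeline, an LLM client or any other model without changing the check itself.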

Use Cases

Pre-deployment safety checks: automatically detect harmful or risky outputs before release.
Regression testing: monitor performance and fairness during model iteration.
RAG evaluation: generate test sets and evaluate retrieval+generation pipelines.
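
The regression-testing use case above can be sketched as a simple gate that compares a candidate model's metrics with a stored baseline. This is an illustrative, library-free sketch rather than Giskard's own mechanism: find_regressions, the metric names and the tolerance value are all assumptions, and every metric is assumed to be "higher is better".

```python
from typing import Dict, List


def find_regressions(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.01,
) -> List[str]:
    """Return metrics where the candidate is missing or drops by more than tolerance."""
    regressions = []
    for metric, base_value in baseline.items():
        new_value = candidate.get(metric)
        if new_value is None or new_value < base_value - tolerance:
            regressions.append(metric)
    return sorted(regressions)


# Hypothetical numbers: overall accuracy improves, but recall for one group drops.
baseline = {"accuracy": 0.91, "recall_group_a": 0.88, "recall_group_b": 0.86}
candidate = {"accuracy": 0.92, "recall_group_a": 0.89, "recall_group_b": 0.80}

regressed = find_regressions(baseline, candidate)
print(regressed)
```

Tracking per-group metrics alongside the aggregate is what lets a gate like this catch a fairness regression that a higher overall accuracy would otherwise hide. In CI, a non-empty result would typically fail the build.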

Technical Highlights

CLI and Python API for scripted and interactive workflows.
Active releases and community support, with extensive docs and examples.
Modular design that lets you add custom checks and integrate them into evaluation pipelines.
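
One common way to get the kind of modularity described above is a small registry of check functions. The sketch below is an assumption about how such a design could look, not a description of Giskard's internals: CHECKS, register_check, run_checks and both example checks are hypothetical, and the email pattern is intentionally simplistic.

```python
import re
from typing import Callable, Dict, List

# Hypothetical registry mapping a check name to a function that returns True on pass.
CHECKS: Dict[str, Callable[[str], bool]] = {}


def register_check(name: str):
    """Decorator that adds a check function to the registry under the given name."""
    def decorator(fn: Callable[[str], bool]) -> Callable[[str], bool]:
        CHECKS[name] = fn
        return fn
    return decorator


@register_check("no_email_leak")
def no_email_leak(output: str) -> bool:
    # Rough pattern for illustration only; real leak detection needs more care.
    return re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", output) is None


@register_check("non_empty")
def non_empty(output: str) -> bool:
    return bool(output.strip())


def run_checks(output: str) -> List[str]:
    """Return the names of all registered checks that the output fails."""
    return sorted(name for name, check in CHECKS.items() if not check(output))


print(run_checks("Contact me at jane@example.com"))
```

Adding a new check is then a matter of writing one decorated function, and an evaluation pipeline only needs to call run_checks on each model output.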

Related Resources

Agenta

ReLE Chinese LLM Benchmark

DeepEval