Overview
ReLE (chinese-llm-benchmark) is a community-maintained Chinese LLM evaluation and leaderboard project that provides fine-grained benchmarks across education, medicine, finance, law, reasoning, language understanding, and multimodal tasks.
Key features
- Extensive benchmark suites and leaderboards, including a large badcase repository.
- Regular releases and changelogs, with tools for model selection and leaderboard viewing.
- Leaderboard data and visualizations for analysis and debugging (see the sketch after this list).
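As a minimal sketch of how the leaderboard data could be used for model comparison, the Python snippet below loads a hypothetical CSV export named `leaderboard.csv` with `model`, `domain`, and `score` columns; the actual file names and column layout in the repository may differ.

```python
# Minimal sketch: rank models from a hypothetical leaderboard export.
import pandas as pd

# Assumed columns: "model", "domain", "score" (the real export may differ).
df = pd.read_csv("leaderboard.csv")

# Average each model's score across domains and sort descending.
ranking = (
    df.groupby("model")["score"]
      .mean()
      .sort_values(ascending=False)
)
print(ranking.head(10))

# Narrow the comparison to a single domain, e.g. legal tasks.
legal = df[df["domain"] == "legal"].sort_values("score", ascending=False)
print(legal[["model", "score"]].head(5))
```

The same pattern extends to any domain column in the export, which is the typical workflow for shortlisting candidate models before deeper evaluation.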
Use cases
- Model evaluation and selection for research and engineering teams focused on Chinese-language LLMs.
- Course material and reading lists for MLSys/LLM courses that need Chinese-language benchmarks.
- Error analysis and badcase collection to improve model robustness.
Technical characteristics
- Maintained as Markdown on GitHub, so it is easy to update via pull requests and community contributions.
- Includes leaderboards, downloadable data, and badcase visualizations for rapid analysis (see the badcase sketch below).
- Some content integrates with a dedicated site (nonelinear.com) for online presentation.
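The sketch below illustrates one way to summarize badcases for error analysis. It assumes a hypothetical `badcases.json` file containing a list of records with `category`, `question`, `expected`, and `answer` fields; the real badcase files in the repository may be organized differently.

```python
# Minimal sketch: summarize a hypothetical badcase dump by task category.
import json
from collections import Counter

# Assumed format: a JSON list of records with "category", "question",
# "expected", and "answer" keys (the repository's actual layout may differ).
with open("badcases.json", encoding="utf-8") as f:
    badcases = json.load(f)

# Count failures per category to see where a model is weakest.
by_category = Counter(case["category"] for case in badcases)
for category, count in by_category.most_common():
    print(f"{category}: {count} badcases")

# Inspect a few raw examples from the worst category for error analysis.
worst = by_category.most_common(1)[0][0]
for case in [c for c in badcases if c["category"] == worst][:3]:
    print(case["question"], "->", case["answer"], "(expected:", case["expected"], ")")
```

Grouping badcases this way makes it straightforward to spot systematic weaknesses (for example, a cluster of failures in legal or reasoning tasks) before targeted fine-tuning or prompt changes.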