ReLE Chinese LLM Benchmark

ReLE (chinese-llm-benchmark) is a continuously updated Chinese LLM evaluation and leaderboard project covering education, medical, finance, legal, reasoning and other capability dimensions.

jeinlee1991 · Since 2023-06-04

Overview

ReLE (chinese-llm-benchmark) is a community-maintained Chinese LLM evaluation and leaderboard project that provides fine-grained benchmarks across education, medical, finance, legal, reasoning, language understanding and multimodal tasks.

Key features

Extensive benchmark suites and leaderboards, including a large badcase repository.
Regular releases and changelogs, with tools for model selection and leaderboard viewing.
Leaderboard data and visualizations for analysis and debugging.

Use cases

Model evaluation and selection for research and engineering teams focused on Chinese-language LLMs.
Course material and reading lists for MLSys/LLM classes that cover Chinese benchmarks.
Error analysis and badcase collection to improve model robustness.

Technical characteristics

Maintained as Markdown files on GitHub; easy to update via PRs and community contributions.
Includes leaderboards, downloadable data and badcase visualizations for rapid analysis.
Some content integrates with a dedicated site (nonelinear.com) for online presentation.

Related Resources

Agenta

DeepEval

Dingo