
ReLE Chinese LLM Benchmark

ReLE (chinese-llm-benchmark) is a continuously updated Chinese LLM evaluation and leaderboard project covering education, medicine, finance, law, reasoning, and other capability dimensions.

Overview

ReLE (chinese-llm-benchmark) is a community-maintained Chinese LLM evaluation and leaderboard project that provides fine-grained benchmarks across education, medicine, finance, law, reasoning, language understanding, and multimodal tasks.

Key features

  • Extensive benchmark suites and leaderboards, including a large badcase repository.
  • Regular releases and changelogs, with tools for model selection and leaderboard viewing.
  • Leaderboard data and visualizations for straightforward analysis and debugging.
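As a minimal sketch of how leaderboard data like this might be consumed, the snippet below loads a hypothetical CSV excerpt (the column names and values are illustrative, not ReLE's actual data format), computes a per-model average across capability dimensions, and ranks the models:

```python
import csv
import io

# Hypothetical leaderboard excerpt; ReLE's real files may use different
# columns, dimensions, and score scales.
raw = """model,education,medical,finance,legal,reasoning
model_a,78.2,71.5,69.0,74.3,66.8
model_b,81.0,68.9,72.4,70.1,70.2
"""

rows = list(csv.DictReader(io.StringIO(raw)))
for row in rows:
    scores = [float(v) for k, v in row.items() if k != "model"]
    row["average"] = round(sum(scores) / len(scores), 2)

# Rank models by average score, highest first.
ranked = sorted(rows, key=lambda r: r["average"], reverse=True)
print([(r["model"], r["average"]) for r in ranked])
```

The same pattern extends to any per-dimension leaderboard export: parse, aggregate, then sort by whichever dimension matters for your selection task.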

Use cases

  • Model evaluation and selection for research and engineering teams focused on Chinese-language LLMs.
  • Course material and reading lists for MLSys/LLM classes with Chinese benchmarks.
  • Error analysis and badcase collection to improve model robustness.
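Badcase collection of the kind described above can be sketched with a few lines of stdlib Python; the record schema here (`domain`, `expected`, `predicted`) is an assumption for illustration, not ReLE's actual storage format:

```python
from collections import Counter

# Hypothetical evaluation records; a real run would load these from
# the benchmark's result files.
records = [
    {"domain": "legal", "expected": "A", "predicted": "A"},
    {"domain": "legal", "expected": "B", "predicted": "C"},
    {"domain": "medical", "expected": "A", "predicted": "D"},
    {"domain": "reasoning", "expected": "C", "predicted": "C"},
]

# A badcase is any record where the model's answer misses the reference.
badcases = [r for r in records if r["predicted"] != r["expected"]]

# Counting failures per domain highlights which capability dimensions
# need robustness work.
by_domain = Counter(r["domain"] for r in badcases)
print(dict(by_domain))
```

Grouping failures by dimension is what makes a badcase repository actionable: it turns a flat error list into a prioritized view of where a model is weakest.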

Technical characteristics

  • Maintained as GitHub Markdown; easy to update via PRs and community contributions.
  • Includes leaderboards, downloadable data and badcase visualizations for rapid analysis.
  • Some content integrates with a dedicated site (nonelinear.com) for online presentation.
