Open R1

Open R1 is Hugging Face's open reproduction of DeepSeek-R1, providing training, evaluation and data generation pipelines for researchers to reproduce and extend R1 capabilities.

Author: Hugging Face

Since: 2025-01-24


Overview

Open R1 is an open reproduction of DeepSeek-R1 maintained by Hugging Face. It provides end-to-end recipes for distillation, supervised fine-tuning (SFT), and reinforcement learning with Group Relative Policy Optimization (GRPO), together with datasets and evaluation tooling for reproducing R1-style reasoning capabilities.
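
To make the GRPO step more concrete, here is a minimal sketch of the group-relative advantage that gives the method its name: several completions are sampled for one prompt, and each reward is normalized against the mean and standard deviation of its own group. The function name, the `eps` value, and the use of the population standard deviation are assumptions for illustration; this is not code from the Open R1 repository, and real implementations may differ in those details.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize rewards within one group of completions for the same prompt.

    `eps` (hypothetical default) keeps the division finite when every
    completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt, scored 1.0 (correct) or 0.0 (incorrect).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that score above their group's average get a positive advantage and the rest a negative one, so no separate value model is needed to provide a baseline.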

Key features

End-to-end scripts and Makefile targets for data distillation, SFT, GRPO, and evaluation.
Published datasets such as Mixture-of-Thoughts and OpenR1-Math for training reasoning models.
Integrations with vLLM, multiple sandbox providers (E2B, Morph), and high-performance training backends.

Use cases

Research reproduction: reproduce DeepSeek-R1 training and evaluation results.
Data generation & distillation: create reasoning traces and distilled datasets for model training.
Scalable training pipelines: example configs for Slurm, Accelerate and DeepSpeed clusters.
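
As a rough illustration of the data generation and distillation use case, the sketch below filters sampled reasoning traces, keeping only those that follow a `<think>...</think>` layout and whose final answer matches a reference. The record fields (`prompt`, `completion`, `reference`), the exact-match comparison, and both function names are hypothetical; the actual Open R1 pipeline may structure and verify traces differently.

```python
import re

# Assumed layout: reasoning inside <think> tags, final answer after them.
THINK_PATTERN = re.compile(r"^<think>\n(.+?)\n</think>\n(.+)$", re.DOTALL)


def split_trace(completion):
    """Return (reasoning, answer), or None if the layout does not match."""
    match = THINK_PATTERN.match(completion.strip())
    if match is None:
        return None
    return match.group(1).strip(), match.group(2).strip()


def filter_traces(samples):
    """Keep well-formed traces whose answer equals the reference exactly."""
    kept = []
    for sample in samples:
        parsed = split_trace(sample["completion"])
        if parsed is None:
            continue  # malformed trace: no <think> block
        reasoning, answer = parsed
        if answer == sample["reference"].strip():
            kept.append(
                {"prompt": sample["prompt"], "reasoning": reasoning, "answer": answer}
            )
    return kept


samples = [
    {"prompt": "2+2?", "completion": "<think>\n2+2 is 4\n</think>\n4", "reference": "4"},
    {"prompt": "2+2?", "completion": "4", "reference": "4"},
    {"prompt": "3+3?", "completion": "<think>\n3+3 is 5\n</think>\n5", "reference": "6"},
]
distilled = filter_traces(samples)
print(distilled)
```

Exact string matching is the simplest possible check; math or code datasets would typically need a more tolerant verifier, such as symbolic comparison or test execution in a sandbox.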

Technical characteristics

Implemented primarily in Python; pins specific PyTorch and vLLM versions and requires a matching CUDA toolchain.
Supports DDP and DeepSpeed (ZeRO-2/3), tensor/data parallelism and optimized kernels for large-model training.
Includes evaluation scripts built on lighteval and benchmark recipes for reproducing reported metrics on math, code and reasoning benchmarks.
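
The listing does not say which metrics the benchmark recipes report. One metric commonly used when several completions are sampled per problem is pass@k, and the sketch below shows its standard unbiased estimator, 1 - C(n-c, k) / C(n, k), for n samples of which c are correct. The function name and argument checks are assumptions for illustration, not part of lighteval or Open R1.

```python
from math import comb


def pass_at_k(n, c, k):
    """Estimate the chance that at least one of k drawn samples is correct.

    n: samples generated for the problem, c: how many were correct,
    k: number of samples drawn without replacement.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        return 1.0  # too few incorrect samples to fill all k draws
    return 1.0 - comb(n - c, k) / comb(n, k)


# 10 samples for one problem, 3 of them correct.
score = pass_at_k(n=10, c=3, k=1)
print(score)
```

With k = 1 the estimator reduces to the fraction of correct samples, which is how a per-problem pass@1 score is usually averaged over a benchmark.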

Related Resources

Evaluation Guidebook

TRL

LeRobot