
Open-dLLM: Open Diffusion Large Language Models

An open-source stack for diffusion-based large language models that covers pretraining, evaluation, inference and checkpoints.

Detailed Introduction

Open-dLLM is an open-source project for diffusion-based large language models (LLMs). It provides an end-to-end stack covering raw data processing, pretraining, evaluation, inference, and distribution of checkpoints. The repository includes Open-dCoder, a code-generation variant, along with training pipelines, evaluation harnesses, and model weights published on Hugging Face for reproducibility.

Main Features

  • End-to-end, reproducible training pipelines from data preparation to large-scale training.
  • An open evaluation suite covering HumanEval, MBPP, code infilling, and custom metrics for diffusion LLMs.
  • Easy-to-use inference and sampling scripts for experimentation and deployment (a toy unmasking sampler is sketched after this list).
  • Published checkpoints on Hugging Face to enable reproduction and transfer learning.
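
The repository's actual script interfaces are not reproduced here, but as a rough sketch of what "sampling" means for this model family: generation starts from a fully masked sequence and, over a fixed number of steps, commits the mask positions the model is most confident about. Everything below (the `model` callable, `mask_id`, tensor shapes) is an assumption for illustration, not Open-dLLM's inference API.

```python
import torch

@torch.no_grad()
def sample(model, seq_len, mask_id, steps=8):
    """Toy confidence-based unmasking loop for an absorbing-mask diffusion LM."""
    tokens = torch.full((1, seq_len), mask_id, dtype=torch.long)  # start fully masked
    for step in range(steps):
        logits = model(tokens)                        # assumed shape (1, seq_len, vocab)
        conf, pred = logits.softmax(dim=-1).max(dim=-1)
        still_masked = tokens == mask_id
        if not still_masked.any():
            break
        conf = conf.masked_fill(~still_masked, -1.0)  # ignore already-filled slots
        # Commit a fraction of the remaining masks each step, most confident first.
        k = max(1, still_masked.sum().item() // (steps - step))
        top = conf.topk(k, dim=-1).indices            # (1, k) positions to unmask
        tokens.scatter_(1, top, pred.gather(1, top))  # write the predicted tokens
    return tokens
```

Real samplers vary in how they schedule unmasking (fixed fractions, noise-level schedules, remasking of low-confidence tokens); this greedy variant only shows the overall shape of the loop.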

Use Cases

  • Researchers exploring diffusion-based generation techniques for LLMs and how to optimize them.
  • Engineering teams reproducing experiments, training custom models, or fine-tuning existing checkpoints for specific tasks.
  • Teaching and benchmarking: a reproducible, open reference point for code generation and infilling tasks (a toy infilling routine is sketched after this list).
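
Infilling is a natural fit for this model family: unlike a left-to-right decoder, a masked-diffusion model conditions on both sides of a gap at once. The sketch below illustrates that framing under the same assumptions as before (hypothetical `model` and `mask_id`); it is not the repository's benchmark harness.

```python
import torch

@torch.no_grad()
def infill(model, prefix_ids, suffix_ids, span_len, mask_id):
    """Fill a masked span between known prefix and suffix token ids."""
    # Layout: [prefix][masked span][suffix]. The model sees, and conditions on,
    # both sides of the gap at every step.
    tokens = torch.cat([prefix_ids,
                        torch.full((span_len,), mask_id, dtype=torch.long),
                        suffix_ids]).unsqueeze(0)
    lo, hi = len(prefix_ids), len(prefix_ids) + span_len
    for _ in range(span_len):                    # unmask one position per step
        logits = model(tokens)                   # assumed (1, total_len, vocab)
        conf, pred = logits.softmax(-1).max(-1)
        span = tokens[0, lo:hi]
        conf_span = conf[0, lo:hi].masked_fill(span != mask_id, -1.0)
        pos = conf_span.argmax().item()          # most confident masked slot
        span[pos] = pred[0, lo + pos]
    return tokens[0, lo:hi]
```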

Technical Features

  • Training objective based on a Masked Diffusion Model (MDM), adapted for code generation and infilling (see the training-step sketch after this list).
  • Integration with VeOmni and lm-eval-harness for dataset handling and benchmarking.
  • Transparent configs and experiment specifications that simplify migration across compute environments and checkpoint uploads to Hugging Face.
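
For readers unfamiliar with the MDM objective, here is a minimal sketch of one training step, assuming the common absorbing-mask formulation: sample a mask ratio, corrupt that fraction of tokens to a mask id, and train with cross-entropy at the corrupted positions. This is illustrative only, not Open-dLLM's training code; `model` stands in for any bidirectional transformer returning per-token logits.

```python
import torch
import torch.nn.functional as F

def mdm_loss(model, tokens, mask_id):
    """One masked-diffusion training step on a batch of clean token ids."""
    # Sample a per-example mask ratio t ~ U(0, 1), then corrupt roughly that
    # fraction of positions to the absorbing [MASK] state.
    t = torch.rand(tokens.size(0), 1, device=tokens.device)
    masked = torch.rand_like(tokens, dtype=torch.float) < t
    corrupted = torch.where(masked, torch.full_like(tokens, mask_id), tokens)

    logits = model(corrupted)  # assumed shape (batch, seq_len, vocab)
    # Cross-entropy only at corrupted positions: recover the original tokens.
    # Proper MDM ELBOs typically reweight this term by 1/t; omitted for brevity.
    return F.cross_entropy(logits[masked], tokens[masked])
```

Infilling falls out of the same objective: at inference time the span to fill is simply initialized as mask tokens, as in the sketch under Use Cases above.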
Resource Info
🌱 Open Source 🧬 LLM 🏗️ Model 🏋️ Training 📝 Evaluation