Detailed Introduction
Dia2 is an open-source text-to-speech (TTS) model and inference implementation from Nari Labs focused on streaming conversational audio. The model can begin generating audio after receiving the initial input tokens and supports conditioning on audio prefixes to maintain speaker consistency and contextual continuity in multi-turn interactions. The repository provides 1B and 2B model checkpoints, example scripts, and quickstart instructions for research and deployment.
Main Features
- Streaming generation: starts synthesis without waiting for the full text, reducing response latency.
- Conditional generation: supports audio-prefix conditioning for speaker consistency and smoother conversation flow.
- Multiple scales: model checkpoints at different sizes (1B, 2B) to balance quality and resource use.
- Open license: released under Apache-2.0, which permits research as well as commercial use.
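The streaming behavior in the first bullet can be illustrated with a minimal, self-contained sketch. This is not the actual Dia2 API; the class and method names below are hypothetical stand-ins. The point is the shape of the loop: an audio chunk is emitted as soon as its source tokens arrive, rather than after the full input has been read.

```python
from typing import Iterator, List


class ToyStreamingTTS:
    """Hypothetical stand-in for a streaming TTS model: it maps each text
    token to a fixed-length 'audio chunk' so the latency pattern is visible."""

    def synthesize_token(self, token: str) -> List[float]:
        # Placeholder for real synthesis: one short chunk of samples per token.
        return [0.0] * 4

    def stream(self, tokens: Iterator[str]) -> Iterator[List[float]]:
        # Incremental emission: each chunk is yielded as soon as its source
        # token is available, without waiting for the full sentence.
        for token in tokens:
            yield self.synthesize_token(token)


tts = ToyStreamingTTS()
chunks = list(tts.stream(iter(["hello", "streaming", "world"])))
# Three tokens in, three audio chunks out, produced one at a time.
```

In a real deployment the consumer would play each chunk as it arrives, which is what reduces perceived response latency compared with batch synthesis.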
Use Cases
- Real-time voice for conversational assistants and virtual characters, improving naturalness and responsiveness.
- Reply generation in voice-based dialog systems with multi-turn context handling.
- Research and teaching for TTS conditional generation, model comparison, and voice control experiments.
Technical Features
- Inference implementation based on Python and the uv runtime, compatible with Hugging Face checkpoints and CUDA acceleration (CUDA 12.8+ recommended).
- Generation length is limited by the context-step budget (around 2 minutes of audio); outputs include audio tokens, a waveform, and timestamps.
- Command-line examples and a Gradio demo are provided for quick verification and integration.
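The roughly 2-minute cap tied to context steps follows from simple arithmetic: maximum duration is the decoder step budget divided by the audio-token frame rate. The numbers below are assumptions chosen only to make the relationship concrete, not documented Dia2 parameters; check the repository for each checkpoint's actual values.

```python
# Assumed values for illustration only; the real context length and
# codec frame rate come from the model checkpoint's configuration.
CONTEXT_STEPS = 1500       # hypothetical decoder step budget
FRAMES_PER_SECOND = 12.5   # hypothetical audio-token frame rate

# Duration cap implied by the step budget.
max_seconds = CONTEXT_STEPS / FRAMES_PER_SECOND
print(max_seconds)  # 120.0 seconds, i.e. about 2 minutes
```

The same calculation run in reverse tells you how many context steps a given audio prefix consumes, which matters when conditioning on prior turns.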