Introduction
Candle is a Rust-first, high-performance machine learning framework from Hugging Face. It targets serverless inference and lightweight deployments, with backends for CPU, CUDA, Metal, and WASM.
Key features
- Minimalist, Rust-based core optimized for performance and small binaries.
- Multi-backend support (optimized CPU, CUDA, Metal, WASM) and model format compatibility (safetensors, npz, ggml, PyTorch); a minimal device-selection sketch follows this list.
- Extensive examples and browser demos covering LLaMA, Whisper, Stable Diffusion, and more.
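To make the first two bullets concrete, here is a minimal sketch of a tensor operation with candle-core. It assumes the crate's `Device::cuda_if_available` helper for backend selection, which falls back to the CPU backend when no GPU build is available:

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    // Prefer CUDA when built with GPU support; otherwise fall back to
    // the optimized CPU backend (assumes the cuda_if_available helper).
    let device = Device::cuda_if_available(0)?;

    // Tensor::randn(mean, std, shape, device) creates random matrices.
    let a = Tensor::randn(0f32, 1.0, (2, 3), &device)?;
    let b = Tensor::randn(0f32, 1.0, (3, 4), &device)?;

    // The matmul dispatches to whichever backend `device` resolved to.
    let c = a.matmul(&b)?;
    println!("{c}");
    Ok(())
}
```

The same code runs unchanged across backends; only the device handle differs, which is what makes small, portable binaries practical.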
Use cases
- Deploying models where a Python runtime is undesirable or too heavy (see the weight-loading sketch after this list).
- Serverless or edge deployments that require fast startup and small footprints.
- Integrations that need Rust-native high-performance inference kernels.
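As a sketch of the first use case, the snippet below loads a safetensors checkpoint with no Python dependency at all. It assumes candle-core's `safetensors::load` helper, and `model.safetensors` is a placeholder path:

```rust
use candle_core::{Device, Result};

fn main() -> Result<()> {
    let device = Device::Cpu;
    // Read every tensor in the checkpoint into a name -> Tensor map;
    // "model.safetensors" stands in for the deployed weight file.
    let tensors = candle_core::safetensors::load("model.safetensors", &device)?;
    for (name, tensor) in &tensors {
        println!("{name}: {:?}", tensor.shape());
    }
    Ok(())
}
```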
Technical details
- The repository is largely Rust (~80%) with CUDA and Metal kernels, and it provides modular crates such as candle-core, candle-nn, and candle-examples; a short layer-composition sketch follows this list.
- Supports quantized inference (including ggml-style quantized weights) on top of these backends for fast, production-ready deployments.
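To illustrate how the crates compose, the following sketch stacks candle-nn's Linear module on top of candle-core tensors. The random weights are a stand-in for values a real model would load from a checkpoint (for instance via candle-nn's VarBuilder):

```rust
use candle_core::{DType, Device, Result, Tensor};
use candle_nn::{Linear, Module};

fn main() -> Result<()> {
    let device = Device::Cpu;

    // Random weights keep the sketch self-contained; a real model would
    // load these from a checkpoint instead of initializing them here.
    let w = Tensor::randn(0f32, 1.0, (4, 8), &device)?;
    let b = Tensor::zeros(4, DType::F32, &device)?;
    let layer = Linear::new(w, Some(b));

    // Forward pass: a (1, 8) input maps to a (1, 4) output.
    let x = Tensor::randn(0f32, 1.0, (1, 8), &device)?;
    let y = layer.forward(&x)?;
    println!("{:?}", y.shape());
    Ok(())
}
```

The split mirrors the crate boundaries: candle-core owns tensors and devices, while candle-nn layers modules like Linear on top via the Module trait.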