Omnilingual ASR: Open-Source Multilingual Speech Recognition

An open-source multilingual speech recognition project from Facebook Research (Meta) supporting over 1,600 languages.

Author: Facebook Research

Since: 2025-11-06

Detailed Introduction

Omnilingual ASR is an open-source multilingual speech recognition project from Facebook Research (Meta) that supports over 1,600 languages. It combines scalable zero-shot learning with a flexible family of models, so a new language can be added from only a few paired examples. The repository provides end-to-end tooling: data preparation, training recipes, evaluation suites, and an inference pipeline. Datasets and demo spaces are published on Hugging Face for reproducibility.

Main Features

Language-conditioned pipeline covering 1,600+ languages, with supported-language lists that can be queried programmatically.
Multiple model families: W2V (SSL), CTC, and LLM-ASR variants to balance compute and accuracy.
End-to-end training and fine-tuning recipes for distributed and reproducible experiments.
Public dataset (CC-BY-4.0) and Hugging Face demos for easy benchmarking and evaluation.

Use Cases

Language inclusion and preservation: quickly build ASR for low-resource languages.
Research & benchmarking: compare architectures (CTC / LLM-ASR / W2V) across many languages.
Engineering deployment: choose appropriate model cards and integrate the inference pipeline for batch or streaming transcription.

Technical Features

Integrates self-supervised W2V models, CTC training, and LLM-based ASR approaches to trade off generality and precision.
Provides a programmable inference pipeline, language ID utilities, and batch processing examples for large-scale transcription.
Transparent asset management for models, tokenizers, and datasets, which simplifies downloads, caching, and reproducibility.
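
As a rough illustration of the batch-processing idea above, the following Python sketch groups (audio path, language code) pairs into fixed-size batches and passes each batch to a transcription callable. It does not use the project's actual API: `batched`, `transcribe_all`, and `fake_transcribe` are hypothetical names introduced here, and language codes such as `eng_Latn` are illustrative assumptions. In practice the stand-in callable would be replaced by a thin wrapper around the real inference pipeline.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple

# One work item: (audio file path, language code).
# Codes like "eng_Latn" are illustrative assumptions, not a confirmed format.
Item = Tuple[str, str]

# Hypothetical transcriber signature: (paths, language codes) -> transcripts.
Transcriber = Callable[[Sequence[str], Sequence[str]], List[str]]


def batched(items: Iterable[Item], batch_size: int) -> Iterator[List[Item]]:
    """Yield consecutive batches holding at most batch_size items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def transcribe_all(
    items: Iterable[Item], transcribe: Transcriber, batch_size: int = 2
) -> List[str]:
    """Run transcribe over all items batch by batch, preserving input order."""
    results: List[str] = []
    for batch in batched(items, batch_size):
        paths = [path for path, _ in batch]
        langs = [lang for _, lang in batch]
        texts = transcribe(paths, langs)
        if len(texts) != len(batch):
            raise RuntimeError("transcriber returned a mismatched batch size")
        results.extend(texts)
    return results


def fake_transcribe(paths: Sequence[str], langs: Sequence[str]) -> List[str]:
    """Stand-in for a real model call; only echoes its inputs."""
    return [f"<{lang}:{path}>" for path, lang in zip(paths, langs)]


if __name__ == "__main__":
    jobs = [("a.wav", "eng_Latn"), ("b.wav", "deu_Latn"), ("c.wav", "fra_Latn")]
    for line in transcribe_all(jobs, fake_transcribe, batch_size=2):
        print(line)
```

Keeping the language code next to each file makes the language conditioning explicit per utterance, and taking the transcriber as a parameter lets the batching logic be tested without loading a model.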
