UniLM

UniLM is a unified pre-training paradigm and project collection from Microsoft Research. It spans language understanding and generation and has spawned multiple foundation-model and multimodal subprojects.

Microsoft Research · Since 2018-12-31

Overview

UniLM is Microsoft Research's unified pre-training approach and the project repository built around it. It supports both understanding and generation tasks, and it has produced foundation models and multimodal projects such as MiniLM, LayoutLM and BEiT, which are widely used in research and production.

Key Features

Unified pre-training objectives that cover both understanding and generation, facilitating transfer to diverse downstream tasks.
A broad collection of subprojects addressing text, document, vision and speech, plus engineering-ready implementations and model checkpoints.
Tooling, examples and pretrained weights that simplify reproduction and deployment.
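
As a rough illustration of what "unified" means for the first feature above: in the original UniLM model, one shared Transformer is trained on several language-modeling objectives, and the objectives differ mainly in which tokens each position is allowed to attend to. The sketch below builds the three kinds of self-attention mask in plain Python. The function names and the 1/0 list-of-lists representation are assumptions made for this example, not code from the UniLM repository.

```python
from typing import List

# Illustrative sketch only: names and representation are hypothetical,
# not taken from the UniLM codebase.
# mask[i][j] == 1 means position i may attend to position j.
Mask = List[List[int]]


def bidirectional_mask(n: int) -> Mask:
    """Understanding-style objective: every token sees every token."""
    return [[1] * n for _ in range(n)]


def left_to_right_mask(n: int) -> Mask:
    """Generation-style objective: a token sees itself and earlier tokens."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def seq2seq_mask(src_len: int, tgt_len: int) -> Mask:
    """Sequence-to-sequence objective over a [source | target] sequence.

    Source tokens attend to the whole source but not to the target.
    Target tokens attend to the whole source plus the target tokens
    up to and including themselves.
    """
    n = src_len + tgt_len
    mask: Mask = []
    for i in range(n):
        row = []
        for j in range(n):
            if j < src_len:
                row.append(1)  # the source is visible from every position
            elif i >= src_len and j <= i:
                row.append(1)  # target position sees earlier target tokens
            else:
                row.append(0)  # future target tokens stay hidden
        mask.append(row)
    return mask


if __name__ == "__main__":
    for row in seq2seq_mask(src_len=2, tgt_len=2):
        print(row)
```

Because only the mask changes between objectives, the same weights can be fine-tuned for classification-style tasks (bidirectional mask) or for generation (left-to-right or sequence-to-sequence mask).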

Use Cases

Researchers reproducing papers and comparing models; engineering teams building downstream applications and fine-tuning pipelines.
Document understanding, OCR, vision+language tasks, text generation and multilingual applications.

Technical Details

Integrates efficient architectures and pretraining methods (e.g., MiniLM, BEiT, X-MoE), emphasizing scalability and practical efficiency.
Open-source licensing and extensive documentation enable community collaboration and engineering adoption.
