Wan2.2

Wan2.2 is a family of large-scale video generation models for text-, image-, and audio-to-video tasks, built on a Mixture-of-Experts (MoE) architecture and a high-compression VAE.

Alibaba · Since 2025-07-28

Overview

Wan2.2 is an open family of video generation models covering text-to-video, image-to-video, combined text-and-image-to-video, and speech-to-video tasks. It introduces a Mixture-of-Experts (MoE) architecture and a high-compression VAE to balance output quality against compute cost.

Key Features

MoE architecture: increases effective model capacity through specialized experts, without every expert running at each step.
Multimodal support: text-, image-, and audio-to-video pipelines, plus animation and replacement modules.
Rich ecosystem: released weights, inference code, ComfyUI and Diffusers integrations, and online demos.
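
To make the "specialized experts" idea concrete, here is a minimal sketch of one common way to use experts in a diffusion model: route each denoising step to a single expert based on the noise level. Whether this matches Wan2.2's actual routing is an assumption; the names `Expert`, `route_by_noise`, `capacity_summary`, the boundary value, and the parameter counts are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expert:
    name: str
    params_billion: float  # hypothetical size, for illustration only


def route_by_noise(t: float, boundary: float, high: Expert, low: Expert) -> Expert:
    """Pick exactly one expert for a denoising step.

    t is a normalized noise level in [0, 1], where 1.0 means pure noise.
    Steps at or above the boundary go to the high-noise expert (assumed rule).
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be in [0, 1]")
    return high if t >= boundary else low


def capacity_summary(experts: list[Expert]) -> tuple[float, float]:
    """Return (total parameters, largest per-step active parameters) in billions."""
    total = sum(e.params_billion for e in experts)
    active = max(e.params_billion for e in experts)  # only one expert runs per step
    return total, active


if __name__ == "__main__":
    high = Expert("high_noise", 14.0)
    low = Expert("low_noise", 14.0)
    for t in (0.95, 0.50, 0.10):
        print(t, route_by_noise(t, 0.5, high, low).name)
    print(capacity_summary([high, low]))
```

The point of the sketch is the gap between the two numbers returned by `capacity_summary`: total capacity grows with the number of experts, while the cost of a single step stays at the size of one expert.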

Use Cases

Film and short-video content generation and stylistic editing.
Research and benchmarking for video generation and MoE/compression strategies.
Prototyping and demos via Hugging Face Spaces or self-hosted services.

Technical Characteristics

High-compression VAE and MoE design for efficient high-resolution video generation.
Multiple inference modes (single-GPU, multi-GPU, FSDP + DeepSpeed) and model conversion tooling.
Apache-2.0 licensed, actively maintained, and accompanied by an academic publication.
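
As a rough illustration of why a high-compression VAE helps with high-resolution video, the sketch below computes the latent grid size for a clip and the resulting reduction in positions the generator has to process. The stride values, the clip size, and the convention that the first frame is kept and the remaining frames are compressed in time are assumptions for illustration, not figures taken from this page; `latent_shape` and `position_reduction` are hypothetical helper names.

```python
import math


def latent_shape(frames: int, height: int, width: int,
                 t_stride: int, s_stride: int) -> tuple[int, int, int]:
    """Latent (time, height, width) for a video VAE with the given strides.

    Assumes the first frame is kept and the remaining frames are grouped
    by t_stride, a convention some causal video VAEs use.
    """
    lat_t = 1 + math.ceil((frames - 1) / t_stride)
    lat_h = math.ceil(height / s_stride)
    lat_w = math.ceil(width / s_stride)
    return lat_t, lat_h, lat_w


def position_reduction(frames: int, height: int, width: int,
                       t_stride: int, s_stride: int) -> float:
    """Ratio of pixel positions to latent positions (channels ignored)."""
    lat_t, lat_h, lat_w = latent_shape(frames, height, width, t_stride, s_stride)
    return (frames * height * width) / (lat_t * lat_h * lat_w)


if __name__ == "__main__":
    # Hypothetical clip: 81 frames at 1280x720, strides 4 (time) and 16 (space).
    print(latent_shape(81, 720, 1280, t_stride=4, s_stride=16))
    print(round(position_reduction(81, 720, 1280, t_stride=4, s_stride=16), 1))
```

Larger strides shrink the latent grid, so the generator attends over far fewer positions per clip; the trade-off is that the VAE decoder must reconstruct more detail from each latent position.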
