Wan2.2

Wan2.2 is a family of large-scale video generation models that support text/image/audio-to-video tasks, featuring MoE and high-compression VAE designs.

Overview

Wan2.2 is an open family of video generation models covering text-to-video, image-to-video, text-image-to-video, and speech-to-video tasks. It introduces a Mixture-of-Experts (MoE) architecture and a high-compression VAE to balance output quality against compute cost.

Key Features

  • MoE architecture: increases effective model capacity through specialized experts.
  • Multimodal support: text-, image-, and audio-to-video pipelines, plus animation and replacement modules.
  • Rich ecosystem: released weights, inference code, ComfyUI and Diffusers integrations, and online demos.
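
The listing does not say how experts are selected, so the following is only a minimal sketch of one common pattern for diffusion-style MoE: routing each denoising step to one of two experts by noise level. The class name `TwoExpertDenoiser`, the `boundary` value, and the toy expert functions are all hypothetical and are not taken from the Wan2.2 codebase.

```python
from dataclasses import dataclass
from typing import Callable, List

# An "expert" here is any callable mapping (latent, noise_level) -> new latent.
Expert = Callable[[List[float], float], List[float]]


@dataclass
class TwoExpertDenoiser:
    """Illustrative router: one expert per noise regime (assumed design)."""
    high_noise_expert: Expert
    low_noise_expert: Expert
    boundary: float  # hypothetical switch point, noise level in [0, 1]

    def select(self, noise_level: float) -> Expert:
        # Only one expert is active per step, so per-step compute stays at
        # single-expert cost while total parameter count is doubled.
        if noise_level >= self.boundary:
            return self.high_noise_expert
        return self.low_noise_expert

    def step(self, latent: List[float], noise_level: float) -> List[float]:
        return self.select(noise_level)(latent, noise_level)


# Toy stand-ins for real networks (purely illustrative).
def high(latent: List[float], noise_level: float) -> List[float]:
    return [x * 0.5 for x in latent]


def low(latent: List[float], noise_level: float) -> List[float]:
    return [x * 0.9 for x in latent]


denoiser = TwoExpertDenoiser(high, low, boundary=0.5)
latent = [1.0, 2.0]
for noise_level in (1.0, 0.75, 0.25, 0.0):  # from noisy to clean
    latent = denoiser.step(latent, noise_level)
```

Because exactly one expert runs at each step, this kind of routing raises total capacity without raising per-step inference cost, which is the trade-off the MoE bullet describes.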

Use Cases

  • Film and short-video content generation and stylistic editing.
  • Research and benchmarking for video generation and MoE/compression strategies.
  • Prototyping and demos via Hugging Face Spaces or self-hosted services.

Technical Characteristics

  • High-compression VAE and MoE design for efficient high-resolution video generation.
  • Multiple inference modes (single-GPU, and multi-GPU using FSDP with DeepSpeed Ulysses) and model conversion tooling.
  • Released under the Apache-2.0 license, actively maintained, and accompanied by an academic publication.
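
To make the effect of a high-compression VAE concrete, the sketch below computes the latent grid size for a video clip. The default strides (4 in time, 16 in each spatial dimension) and the convention that the first frame is encoded on its own are assumptions chosen for illustration; the listing does not state the actual ratios, so check the model card before relying on them. The function name `latent_shape` is hypothetical.

```python
from typing import Tuple


def latent_shape(frames: int, height: int, width: int,
                 t_stride: int = 4, s_stride: int = 16) -> Tuple[int, int, int]:
    """Return (latent_frames, latent_height, latent_width).

    Assumes a causal video VAE: the first frame is kept as its own latent
    frame, and every further group of `t_stride` frames adds one more.
    The stride defaults are illustrative assumptions, not confirmed values.
    """
    if (frames - 1) % t_stride != 0:
        raise ValueError("frames must be of the form 1 + k * t_stride")
    if height % s_stride != 0 or width % s_stride != 0:
        raise ValueError("height and width must be multiples of s_stride")
    return (1 + (frames - 1) // t_stride, height // s_stride, width // s_stride)


shape = latent_shape(frames=81, height=704, width=1280)
print(shape)  # -> (21, 44, 80)
```

Under these assumed strides, an 81-frame 704x1280 clip is denoised on a 21x44x80 grid, so the diffusion model processes far fewer positions than there are pixels.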

Resource Info
Author: Wan-AI
Added Date: 2025-09-23
Tags: LLM, Project, OSS