
Qwen

An open-source series of large-scale multilingual pretrained and chat models from Alibaba Cloud's Tongyi team, available in multiple model sizes with quantized deployment options.

Introduction

Qwen (通义千问) is an open-source series of large-scale pretrained and chat models developed by Alibaba Cloud's Tongyi team. The family spans 1.8B, 7B, 14B, and 72B parameters, each with a Chat variant and quantized inference options. The project documents the full workflow from model download, through quantization (GPTQ Int4/Int8), to deployment (vLLM, FastChat, Docker) and fine-tuning (LoRA, Q-LoRA), making it suitable for both research and production use.
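
For a quick start, the repo's documented Transformers path loads a Chat model and calls its built-in chat() helper. A minimal sketch, assuming the Hugging Face Hub ID Qwen/Qwen-7B-Chat and a GPU with enough memory:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code is required: Qwen ships custom modeling code on the Hub.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B-Chat", device_map="auto", trust_remote_code=True
).eval()

# chat() is Qwen's custom helper; it returns the reply plus the updated
# history for multi-turn conversations.
response, history = model.chat(tokenizer, "Hello! Who are you?", history=None)
print(response)
```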

Key Features

  • Multi-scale models: Provides base and Chat models at 1.8B, 7B, 14B, and 72B parameters, with long-context support up to 32K tokens.
  • Open-source quantization: Releases GPTQ-based Int4/Int8 quantized models for deployment in resource-constrained environments (see the Int4 sketch after this list).
  • Rich toolchain: Ships fine-tuning, quantization, demo, Docker, and API examples, and supports the Transformers, ModelScope, and vLLM ecosystems.
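
The quantized checkpoints load through the same interface as the full-precision ones. A sketch of the GPTQ Int4 path, assuming the Hub ID Qwen/Qwen-7B-Chat-Int4 and that auto-gptq and optimum are installed for the quantized kernels:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# The Int4 GPTQ checkpoint uses the same loading API; the quantization
# config travels with the checkpoint itself.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat-Int4", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B-Chat-Int4", device_map="auto", trust_remote_code=True
).eval()

response, _ = model.chat(tokenizer, "Summarize GPTQ in one sentence.", history=None)
print(response)
```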

Use Cases

  • Large-scale dialogue systems and customer-service bots.
  • Local offline inference and compressed-model deployment (edge or private cloud).
  • Research benchmarks and fine-tuning (LoRA/Q-LoRA) experiment platforms (a LoRA sketch follows this list).
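
For the fine-tuning scenario, the repo ships its own finetune scripts; an equivalent sketch using the PEFT library is below. The hyperparameters are illustrative, and the target module names are assumptions matching Qwen's projection-layer naming that should be checked against the loaded model:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B-Chat", trust_remote_code=True
)

# r, alpha, and dropout are illustrative, not the repo's defaults;
# target_modules assumes Qwen's attention/MLP projection layer names.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["c_attn", "c_proj", "w1", "w2"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```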

Technical Features

  • Supports 32K context on select models, using NTK-aware interpolation, windowed attention, and related techniques for long-text modeling.
  • Provides KV-cache quantization, Flash-Attention support, and multi-GPU deployment options, compatible with inference frameworks such as PyTorch and vLLM (see the vLLM sketch after this list).
  • Comprehensive documentation and examples, including performance comparisons, quantization guides, and deployment scripts for reproducible engineering.
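
As one concrete deployment path, vLLM can serve a Chat model across multiple GPUs. A minimal sketch, assuming a two-GPU host and the Hub ID Qwen/Qwen-14B-Chat:

```python
from vllm import LLM, SamplingParams

# tensor_parallel_size shards the model across GPUs; trust_remote_code
# pulls in Qwen's custom modeling code.
llm = LLM(model="Qwen/Qwen-14B-Chat", trust_remote_code=True, tensor_parallel_size=2)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Explain KV-cache quantization briefly."], params)
print(outputs[0].outputs[0].text)
```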


Resource Info

  • Author: Alibaba
  • Added: 2025-09-24
  • Tags: LLM, OSS, Deployment