gpt-oss

gpt-oss is an open-weight model series released by OpenAI, designed for high-reasoning and customizable developer use cases.

OpenAI · Since 2025-06-23

Loading score...

GitHub Website

Overview

gpt-oss is OpenAI’s open-weight model series (including gpt-oss-120b and gpt-oss-20b) that provides publicly available weights for research and engineering reproduction. The project is released under the Apache-2.0 license and targets high-reasoning, customizable deployments with support for multiple inference backends and tool integrations. This page summarizes its purpose, main features, and common application scenarios.

Key features

Open-weight release (Apache-2.0) enabling research and commercial deployment.
Two scale options: designed for both high-performance single-GPU inference and lighter deployments (120B / 20B).
Harmony response format and tool support (browser, python) with multiple inference backends (Transformers, vLLM, Triton, Metal).

Use cases

Research and large-scale inference: suitable for tasks that require strong reasoning capabilities and traceable outputs.
Local and offline serving: examples and guidance for running with Ollama, vLLM and other local runtimes.
Developer tooling and fine-tuning: reference implementations useful for tuning, benchmarking, and engineering integration.

Technical highlights

Harmony format: structured response format for composable tool calls and structured outputs.
Multi-backend & quantization: support for MXFP4 quantization to reduce memory footprint and improve inference efficiency.
Reference implementations: PyTorch, Triton and Metal examples provided to aid engineering portability and optimization.

Core Content

Core Content

Technology

Technology

More

More

AI Infrastructure

AI Infrastructure

Explore

Explore

Connect

Connect

Quick Links

Quick Links

LinkedIn

LinkedIn

Follow on X

Follow on X

gpt-oss

Overview

Key features

Use cases

Technical highlights

Score Breakdown

Related Resources

Codex

OpenAI Agents (Python)

Whisper