TileLang

TileLang is a domain-specific language for high-performance AI kernels that simplifies writing GPU/CPU/accelerator operators.

Author: Tile AI

Added Date: 2025-10-02

Open Source Since: 2024-10-03


Overview

TileLang (tile-lang) is a DSL designed for implementing high-performance operators (e.g., GEMM, FlashAttention) on GPUs and CPUs. Built on top of TVM, it provides concise Pythonic syntax and tooling for performance engineering.
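To make the idea of a tile-level operator concrete, the following is a minimal sketch of block-tiled GEMM in plain Python. It is not TileLang code and does not use the TileLang API; the function name `tiled_matmul` and the `block_m`, `block_n`, `block_k` parameters are hypothetical names chosen for illustration. It only shows the loop structure that a tile-oriented DSL is meant to express concisely: the output is split into tiles, and each tile accumulates partial products over chunks of the reduction dimension.

```python
def tiled_matmul(a, b, block_m=2, block_n=2, block_k=2):
    """Multiply a (M x K) by b (K x N), computing one output tile at a time.

    Illustrative only: plain Python lists, no TileLang or GPU involved.
    """
    m, k, n = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"

    c = [[0.0] * n for _ in range(m)]

    # Each (i0, j0) pair identifies one output tile. On a GPU, a tile like
    # this is typically the unit of work assigned to one thread block.
    for i0 in range(0, m, block_m):
        for j0 in range(0, n, block_n):
            # Walk the reduction dimension in chunks of block_k, adding
            # each chunk's partial products into the output tile.
            for k0 in range(0, k, block_k):
                for i in range(i0, min(i0 + block_m, m)):
                    for j in range(j0, min(j0 + block_n, n)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + block_k, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(tiled_matmul(a, b))  # [[58.0, 64.0], [139.0, 154.0]]
```

The `min(...)` bounds handle matrix sizes that are not multiples of the block sizes. In a real kernel the tile sizes are a tuning choice that depends on the target device, which this sketch does not attempt to model.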

Key features

Concise DSL and Python API for operator expression and layout annotations.
Multi-backend support (CUDA, HIP, CPU) with device-specific optimizations and examples.
Comprehensive examples and benchmark suites, including MLA decoding, FlashMLA and dequantize GEMM.

Use cases

Implementing and optimizing kernels for deep learning workloads.
Performance tuning on cloud GPUs and accelerators (H100, A100, MI300X, etc.).
Research and engineering workflows connecting high-level models to low-level, optimized kernels.

Technical details

Core implementation uses C++ and Python; relies on TVM for compilation and JIT workflows.
Offers source build instructions, pip packages and nightly builds for quick experimentation.
Includes benchmark scripts and device-specific examples to reproduce reported performance results.
