Detailed Introduction
CoTyle is an open-source image generation project from the Kolors team at Kuaishou that defines the "code-to-style" task: a discrete style codebook is trained so that each numeric code corresponds to a stable, reproducible visual style, and image generation models are conditioned on these codes. The goal is to control global style consistency and diversity with nothing more than a single numeric code, avoiding complex reference images or long prompts.
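As a rough illustration of the idea (not the project's actual code), a discrete style codebook can be thought of as an embedding table in which every integer code deterministically maps to one style vector; the class name, codebook size, and embedding dimension below are assumptions for this sketch.

```python
# Minimal sketch of the "code-to-style" idea: a discrete codebook maps an
# integer style code to a fixed embedding vector that conditions generation.
# Names and sizes here are illustrative assumptions, not the CoTyle API.
import torch
import torch.nn as nn

class StyleCodebook(nn.Module):
    def __init__(self, num_codes: int = 1024, dim: int = 768):
        super().__init__()
        # One learned embedding per numeric style code.
        self.table = nn.Embedding(num_codes, dim)

    def forward(self, code: torch.Tensor) -> torch.Tensor:
        # The same code always returns the same style embedding,
        # which is what makes a style reproducible across runs.
        return self.table(code)

codebook = StyleCodebook()
style_emb = codebook(torch.tensor([42]))  # style #42 -> (1, 768) embedding
print(style_emb.shape)
```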
Main Features
- Trained discrete style codebook for explicit style representation and sampling.
- Conditioning of text-to-image (T2I) diffusion models on style codes to achieve style-consistent outputs.
- Reference implementations for batch and single-sample inference, a Gradio demo, and released weights for reproducibility and extension (see the illustrative inference sketch after this list).
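The following is a hedged sketch of how a single-sample demo driven by a numeric style code might be wired up with Gradio; `generate_image` is a hypothetical placeholder for the project's real style-conditioned inference entry point and is not taken from the repository.

```python
# Hypothetical Gradio wrapper: prompt + numeric style code in, image out.
import gradio as gr
from PIL import Image

def generate_image(prompt: str, style_code: int) -> Image.Image:
    # Placeholder: a real implementation would run the style-conditioned
    # diffusion pipeline with the given code. Here we return a blank canvas.
    return Image.new("RGB", (512, 512), color=(int(style_code) % 256,) * 3)

demo = gr.Interface(
    fn=generate_image,
    inputs=[gr.Textbox(label="Prompt"), gr.Number(label="Style code", precision=0)],
    outputs=gr.Image(label="Generated image"),
    title="Code-to-style demo (illustrative)",
)

if __name__ == "__main__":
    demo.launch()
```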
Use Cases
CoTyle suits creative workflows that need deterministic style control: generating large sets of assets in one unified visual style, quickly assembling product illustration libraries with a consistent look, and researching style representation and generation consistency. It serves designers and researchers exploring style spaces and automated asset production; a minimal batch-generation sketch follows.
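The sketch below shows the batch workflow in outline: one fixed style code keeps a whole asset set visually consistent while the prompts vary. The `generate_image` helper is again a hypothetical stand-in, not the project's pipeline.

```python
# Sketch of a batch asset-generation workflow with a single fixed style code.
from pathlib import Path
from PIL import Image

def generate_image(prompt: str, style_code: int, seed: int) -> Image.Image:
    # Placeholder for style-conditioned generation; deterministic per (code, seed).
    return Image.new("RGB", (512, 512), color=((style_code + seed) % 256, 128, 128))

STYLE_CODE = 42  # one numeric code = one reproducible house style
prompts = ["a castle on a hill", "a market street", "a forest clearing"]

out_dir = Path("assets_style_42")
out_dir.mkdir(exist_ok=True)
for i, prompt in enumerate(prompts):
    img = generate_image(prompt, STYLE_CODE, seed=i)
    img.save(out_dir / f"{i:03d}.png")
```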
Technical Features
The project combines a discrete style codebook with diffusion models: style embeddings are learned from image collections, an autoregressive model is trained over the distribution of style codes, and sampled numeric codes then guide the diffusion model toward style-conditioned synthesis. The implementation builds on modern diffusion frameworks and training pipelines, with optional inference acceleration to trade speed against quality. A minimal sketch of this two-stage setup is given below.
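This is a hedged sketch of the two-stage idea described above: an autoregressive prior proposes numeric style codes, and each code is looked up in a learned codebook and appended to the text conditioning that the diffusion model attends to. All module names, dimensions, and the conditioning mechanism are assumptions for illustration, not the CoTyle code.

```python
# Two-stage sketch: (1) sample style codes from a prior, (2) map each code
# to an embedding and append it to the text-conditioning sequence.
import torch
import torch.nn as nn

NUM_CODES, STYLE_DIM, TEXT_DIM = 1024, 768, 768

class StylePrior(nn.Module):
    """Toy prior over style codes (stand-in for the real autoregressive model)."""
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(NUM_CODES))

    @torch.no_grad()
    def sample(self, n: int) -> torch.Tensor:
        return torch.multinomial(self.logits.softmax(-1), n, replacement=True)

class StyleConditioner(nn.Module):
    """Maps a numeric code to an embedding and appends it to the text tokens."""
    def __init__(self):
        super().__init__()
        self.codebook = nn.Embedding(NUM_CODES, STYLE_DIM)
        self.proj = nn.Linear(STYLE_DIM, TEXT_DIM)

    def forward(self, text_emb: torch.Tensor, code: torch.Tensor) -> torch.Tensor:
        style_tok = self.proj(self.codebook(code)).unsqueeze(1)  # (B, 1, D)
        # A diffusion model would cross-attend to this extended sequence.
        return torch.cat([text_emb, style_tok], dim=1)

prior, conditioner = StylePrior(), StyleConditioner()
codes = prior.sample(2)                   # e.g. tensor([17, 903])
text_emb = torch.randn(2, 77, TEXT_DIM)   # stand-in for text-encoder output
cond = conditioner(text_emb, codes)       # (2, 78, 768) conditioning sequence
print(codes.tolist(), cond.shape)
```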