
RouteLLM

RouteLLM is an open-source framework for serving and evaluating LLM routers. It routes each query between models, sending simpler queries to cheaper models and harder queries to stronger ones, achieving close to strong-model quality at a much lower cost. The project includes server components, evaluation tooling, and pretrained routers.
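As a sketch of the SDK-style usage, following the pattern documented in the project's README (the model names, the example threshold 0.11593, and the response handling are illustrative and may differ across versions):

```python
import os
from routellm.controller import Controller

os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder; set a real key

# Route between a strong and a weak model using the "mf" router.
# The model identifiers below are illustrative; any providers supported
# by the underlying client can be substituted.
client = Controller(
    routers=["mf"],
    strong_model="gpt-4-1106-preview",
    weak_model="mistralai/Mixtral-8x7B-Instruct-v0.1",
)

# The model string encodes the router and its cost threshold: queries the
# router scores above the threshold are sent to the strong model.
response = client.chat.completions.create(
    model="router-mf-0.11593",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```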

Key features

  • Router suite: multiple built-in routers (mf, sw_ranking, bert, etc.) with extensibility for custom routing strategies.
  • Evaluation framework: tools to evaluate router performance on benchmarks (MT Bench, MMLU, GSM8K) and visualize results.
  • OpenAI-compatible server: run an OpenAI-compatible server for seamless client integration.
  • Local & remote model support: route to local models or cloud providers, with threshold calibration and cost controls (see the sketch after this list).
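All of the routers reduce to the same decision rule: score each query, then compare the score against a calibrated threshold. A minimal, hypothetical sketch of that rule (illustrative only, not RouteLLM's internal API):

```python
# Hypothetical threshold-based routing decision (not RouteLLM's internal
# API). Each router assigns a query a score, e.g. the predicted win rate
# of the strong model; the calibrated threshold then sets the
# cost/quality tradeoff.
def route(score: float, threshold: float) -> str:
    # A higher threshold sends fewer queries to the strong model,
    # lowering cost at some risk to quality.
    return "strong" if score >= threshold else "weak"

scores = [0.05, 0.31, 0.62, 0.90]  # made-up router scores for 4 queries
for threshold in (0.3, 0.7):
    decisions = [route(s, threshold) for s in scores]
    strong_share = decisions.count("strong") / len(decisions)
    print(f"threshold={threshold}: {decisions} ({strong_share:.0%} to strong)")
```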

Use cases

  • Reduce inference costs in multi-model deployments while preserving response quality.
  • Research and compare routing strategies across benchmarks.
  • Replace single-model clients with a router service to optimize cost/performance tradeoffs (a client-side sketch follows this list).
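Because the server speaks the OpenAI protocol, swapping a fixed model for the router can be as small as changing the base URL and model name. A minimal sketch, assuming a RouteLLM server is already running locally (the address, port, and router model string below are assumptions for illustration):

```python
from openai import OpenAI

# Point a standard OpenAI client at a locally running RouteLLM server.
# The base URL/port is an assumption; use the address the server was
# started on.
client = OpenAI(
    base_url="http://localhost:6060/v1",
    api_key="no-key-required",  # provider keys are configured server-side
)

# Requesting a "router-..." model lets the server pick the backing model
# per query instead of always calling one fixed model.
completion = client.chat.completions.create(
    model="router-mf-0.11593",
    messages=[{"role": "user", "content": "What is 2 + 2?"}],
)
print(completion.choices[0].message.content)
```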

Technical details

  • Implementation: Python-based, with controller, server, and evaluation components; examples and benchmarks are included.
  • Deployment: usable as a Python SDK or run in server mode; installable via pip or runnable from source.
  • License: Apache-2.0. The project is actively maintained and is accompanied by a research paper and benchmark data.

Resource Info

  • Author: lm-sys
  • Added: 2025-09-30
  • Open source since: 2024-06-03
  • Tags: Open Source, LLM, Router, Evaluation