Opik

Opik is an open-source LLM evaluation and observability platform that helps teams build, evaluate and optimize LLM applications.

Author: Comet

Added Date: 2025-09-27

Open Source Since: 2023-05-10

Opik is an open-source platform developed by Comet for evaluating, monitoring and optimizing LLM-powered applications. It provides tracing, evaluation pipelines and dashboards to improve model quality and production observability.

Key features

End-to-end tracing: captures LLM calls, conversation context and agent activity at scale.
Advanced evaluation: includes LLM-as-a-judge metrics, dataset-driven evaluations and CI integrations.
Production monitoring & rules: online evaluation rules, feedback scoring and Guardrails for production reliability.

Use cases

Evaluating RAG chatbots and dialog systems during development and regression testing.
Tracing and optimizing multi-step agents and code-assistant workflows.
Monitoring token usage, response quality and anomalies in production with fast investigation tools.
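
The regression-testing use case can be sketched as a dataset-driven evaluation with a pass/fail gate, the kind of check a CI job might run. This is an assumption-laden illustration in plain Python rather than Opik's evaluation API: the dataset, `exact_match`, `evaluate`, `fake_app` and the `THRESHOLD` value are all hypothetical.

```python
from typing import Callable

# Hypothetical regression dataset of inputs and expected answers.
DATASET = [
    {"input": "2+2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
    {"input": "3*3", "expected": "9"},
]


def exact_match(output: str, expected: str) -> float:
    """Score 1.0 when the answers match, ignoring case and outer whitespace."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def evaluate(task: Callable[[str], str], dataset: list, metric: Callable) -> float:
    """Run the task on every item and return the mean metric score."""
    scores = [metric(task(item["input"]), item["expected"]) for item in dataset]
    return sum(scores) / len(scores)


def fake_app(prompt: str) -> str:
    # Stand-in for the application under test; it knows two of three answers.
    canned = {"2+2": "4", "capital of France": "paris"}
    return canned.get(prompt, "unknown")


score = evaluate(fake_app, DATASET, exact_match)

# Hypothetical quality gate: a CI job would fail the build when this is False.
THRESHOLD = 0.6
passed = score >= THRESHOLD
```

Swapping `exact_match` for a model-graded metric would turn the same loop into an LLM-as-a-judge evaluation; the loop and the gate stay unchanged.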

Technical notes

SDKs & integrations: Python and TypeScript SDKs with integrations for LangChain, LlamaIndex, AutoGen and others.
Deployments: supports Comet.com cloud or self-hosted deployment (Docker Compose / Kubernetes) with example scripts.
UI & automation: built-in dashboards, Prompt Playground, evaluation rules and Agent Optimizer components.
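
An online evaluation rule can be thought of as a scorer that runs on a sample of production traces and attaches a feedback score to each one it checks. The sketch below illustrates that idea under stated assumptions; `Trace`, `Rule`, `non_empty` and the sampling scheme are hypothetical and do not describe how Opik implements its rules.

```python
import random
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Trace:
    """A simplified production trace: one input, one output, attached scores."""
    input: str
    output: str
    feedback: dict = field(default_factory=dict)


@dataclass
class Rule:
    """Scores a sampled fraction of traces and stores the result as feedback."""
    name: str
    scorer: Callable[[Trace], float]
    sample_rate: float = 1.0  # fraction of traces to score, from 0.0 to 1.0

    def apply(self, trace: Trace, rng: random.Random) -> bool:
        # rng.random() is in [0, 1), so a sample_rate of 1.0 scores every trace.
        if rng.random() >= self.sample_rate:
            return False
        trace.feedback[self.name] = self.scorer(trace)
        return True


def non_empty(trace: Trace) -> float:
    """Toy scorer: 1.0 if the model returned any non-whitespace text."""
    return 1.0 if trace.output.strip() else 0.0


rule = Rule(name="non_empty_answer", scorer=non_empty, sample_rate=1.0)
rng = random.Random(0)  # seeded so the sampling is reproducible

traces = [Trace("hi", "hello"), Trace("hi", "   ")]
for t in traces:
    rule.apply(t, rng)

# Traces that failed the rule are the ones worth investigating first.
flagged = [t for t in traces if t.feedback.get(rule.name) == 0.0]
```

Lowering `sample_rate` trades coverage for cost, which matters when the scorer is itself a model call.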

Related Resources

Giskard OSS

HELM

LightEval