KubeRay

KubeRay is the Ray Project's open-source Kubernetes operator for deploying and managing Ray applications on Kubernetes.

Author: Ray Project

Added Date: 2025-11-03

Open Source Since: 2020-10-29

Detailed Introduction

KubeRay is the Ray Project’s open-source Kubernetes operator for deploying, scaling, and managing Ray applications on Kubernetes. It provides the custom resources RayCluster, RayJob, and RayService, which simplify lifecycle management, autoscaling, and high availability for distributed training, batch processing, and online inference workloads. User-facing documentation is hosted on Ray’s docs site, while the repository contains development and maintenance resources.

Main Features

Custom resource definitions (CRDs) for RayCluster, RayJob, and RayService that automate cluster lifecycle management and autoscaling.
Integrations with the Kubernetes ecosystem (Prometheus, Grafana, Ingress, queueing systems, etc.).
kubectl ray plugin and an experimental dashboard to simplify operations.
Support for production training and inference workloads in cloud-native environments.
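
To make the RayCluster custom resource more concrete, here is a minimal sketch that builds a manifest as a Python dict and prints it as JSON, which kubectl accepts alongside YAML. The field names follow the ray.io/v1 RayCluster schema as commonly documented; the helper name, cluster name, image tag, and replica counts are illustrative assumptions and not values taken from this page. Check the official Ray Kubernetes docs for the fields your KubeRay version supports.

```python
import json


def ray_cluster_manifest(name, image, min_workers, max_workers):
    """Build a RayCluster custom resource as a plain dict (hypothetical helper)."""

    def pod_template(container_name):
        # Minimal pod template; real deployments usually add resource requests.
        return {"spec": {"containers": [{"name": container_name, "image": image}]}}

    return {
        "apiVersion": "ray.io/v1",
        "kind": "RayCluster",
        "metadata": {"name": name},
        "spec": {
            # Lets the operator run the Ray autoscaler next to the head node.
            "enableInTreeAutoscaling": True,
            "headGroupSpec": {
                "rayStartParams": {},
                "template": pod_template("ray-head"),
            },
            "workerGroupSpecs": [
                {
                    "groupName": "workers",
                    "replicas": min_workers,
                    "minReplicas": min_workers,
                    "maxReplicas": max_workers,
                    "rayStartParams": {},
                    "template": pod_template("ray-worker"),
                }
            ],
        },
    }


# Assumed example values: the name, image tag, and worker bounds are placeholders.
manifest = ray_cluster_manifest("demo-cluster", "rayproject/ray:2.9.0", 1, 4)
print(json.dumps(manifest, indent=2))
```

Saving the printed output to a file and passing it to kubectl apply -f would create the cluster, provided the KubeRay operator and its CRDs are already installed.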

Use Cases

KubeRay is suitable for running Ray workloads on Kubernetes: large-scale training jobs, batch data processing, LLM online inference, and services that require elastic scaling. Organizations can integrate KubeRay into CI/CD, monitoring, and scheduling systems to build observable and resilient ML platforms.
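
For the batch-processing use case, a RayJob resource pairs an entrypoint command with a cluster definition. The sketch below builds such a manifest in Python under the same assumptions as any hand-written example: the field names follow the ray.io/v1 RayJob schema as commonly documented, while the helper name, job name, image tag, and entrypoint script are hypothetical placeholders.

```python
import json


def ray_job_manifest(name, entrypoint, image):
    """Build a RayJob custom resource as a plain dict (hypothetical helper)."""
    return {
        "apiVersion": "ray.io/v1",
        "kind": "RayJob",
        "metadata": {"name": name},
        "spec": {
            # Command submitted to the Ray cluster once it is ready.
            "entrypoint": entrypoint,
            # Tear the cluster down when the job finishes, so batch runs
            # do not leave idle pods behind.
            "shutdownAfterJobFinishes": True,
            "rayClusterSpec": {
                "headGroupSpec": {
                    "rayStartParams": {},
                    "template": {
                        "spec": {
                            "containers": [{"name": "ray-head", "image": image}]
                        }
                    },
                },
            },
        },
    }


# Assumed example values: the job name, script path, and image are placeholders.
job = ray_job_manifest("nightly-batch", "python /home/ray/batch.py", "rayproject/ray:2.9.0")
print(json.dumps(job, indent=2))
```

Generating manifests from code like this is one way to plug KubeRay into a CI/CD pipeline; templating with Helm or Kustomize is an equally reasonable choice.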

Technical Features

KubeRay is implemented primarily in Go and follows the Kubernetes Operator pattern. The project distributes Helm charts and examples, and the repository includes tooling, development docs, and quickstarts. For official user guides, see the Kubernetes section of Ray’s documentation.

Related Resources

Ray

LightX2V

Costrict