In recent years, the cloud-native space has seen a surge of new open source projects targeting AI applications. Having long focused on the intersection of Kubernetes and AI, I have been exploring which open source tools can help enterprises better implement LLM inference services and Agentic AI-driven automated operations. The Solo.io team has open sourced several projects in this area, such as kgateway, kagent, agentgateway, and kmcp. These projects are active in the technical directions I follow and offer distinctive features. This article systematically reviews the design concepts, key capabilities, and practical enterprise value of these “AI + Kubernetes” open source tools, analyzes their differences from traditional solutions based on my research experience, and provides recommendations for their use.
Overview of Solo.io Open Source Projects
This article discusses the following four Solo.io open source projects, summarized below:
- kgateway
- Originally Gloo Gateway, built on Envoy Proxy and the Kubernetes Gateway API.
- Provides traditional API gateway capabilities and extends to an AI Gateway, supporting Prompt Guard for prompt protection, Inference Extension for inference service orchestration, and multi-model scheduling and failover.
- Use cases: Unified entry point for LLM traffic governance, multi-model load balancing, secure and compliant access for AI applications.
- kagent
- Kubernetes-native Agentic AI framework that enables platform engineering and operations teams to define and run AI agents in the cluster.
- Implements intelligent diagnostics and automated operations (AgentOps) through MCP protocol tools, multi-agent collaboration, and declarative CRD management.
- Use cases: Automated troubleshooting, performance optimization, intelligent inspection, and task orchestration.
- agentgateway
- A new data plane proxy designed specifically for AI Agent communication, donated to the Linux Foundation.
- Natively supports A2A (Agent-to-Agent) and MCP protocols, providing security governance, observability, tool registration, and federation.
- Use cases: Serves as the communication bus for multi-agent systems, unified gateway for tool invocation, and cross-team tool sharing and governance.
- kmcp
- Toolkit for MCP Server development and operations.
- Offers scaffolding generation (init), image building (build), K8s deployment (deploy), and CRD controller (install) to simplify the lifecycle management of MCP tool services.
- Use cases: Rapid development of MCP tool services with native Kubernetes deployment, integrated with agentgateway for security and governance.
The following sections will introduce each project in detail.
kgateway: Kubernetes AI Gateway
kgateway is Solo.io’s open source implementation of the Kubernetes Gateway API, originally launched as Gloo Gateway in 2018. It is built on Envoy Proxy for the data plane and features a highly extensible control plane for unified management of north-south traffic in clusters. With the rise of AI workloads in 2025, kgateway has added support for LLM service invocation and Agent scenarios on top of its mature gateway capabilities, positioning it as a next-generation cloud-native gateway. As a Kubernetes Ingress/API Gateway, kgateway fully adheres to the Gateway API standard, using custom resources (GatewayClass, Gateway, HTTPRoute, etc.) for routing and policy configuration, and leverages Envoy as the L7 data plane for high-performance forwarding.
kgateway Architecture and Principles
kgateway adopts a control plane + data plane architecture. The control plane listens for changes in Gateway API resources via Kubernetes CRDs and efficiently generates Envoy configuration updates, with official data showing fast response and low resource consumption. The data plane consists of Envoy Proxy processes that execute routing, traffic control, and policy enforcement based on configuration. Developers can use kgateway like a standard Ingress/Egress gateway while benefiting from enhanced features. The diagram below illustrates how kgateway processes AI gateway requests, including prompt guard mechanisms and backend LLM service integration:
Key AI Scenario Capabilities
As an “AI Gateway,” kgateway provides a range of enhanced features for generative AI applications:
- Prompt Guard: Built-in prompt guardrails allow platform teams to configure rules via TrafficPolicy resources to inspect and filter LLM request/response content. For example, string or regex patterns can be set to intercept requests containing sensitive words like “credit card” and return custom errors, or mask suspected credit card numbers in model responses. kgateway also supports sending prompt data to external content moderation services (e.g., OpenAI Moderation API) for pass/fail decisions. This intermediary defense layer decouples security policies from applications, enabling centralized control and a “kill-switch” feature.
- Inference Extension & Intelligent Routing: kgateway supports the Gateway API Inference Extension, introducing custom resources such as `InferencePool` to manage backend model service pools. Operators can create InferencePools and specify Endpoint Selection Extensions (ESE). When requests are routed to an InferencePool, the ESE intelligently selects model instances based on real-time signals such as `modelName`, request priority, queue backlog, GPU cache utilization, and LoRA adapter status, optimizing resource usage and service performance.
- Multi-Model Governance & Extension: Through the Inference Extension, kgateway easily supports multi-model scheduling. Operators can define separate InferencePools and InferenceModels for different models/versions, integrate with open source inference engines like vLLM, and update backend models without changing frontend calls. Combined with TrafficPolicy, kgateway can enforce model-specific API requirements (e.g., context length, concurrency limits). Mature API gateway plugins (auth, rate limiting, circuit breaking) are also available for high-concurrency, secure, and resilient AI scenarios.
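To make the Inference Extension concrete, the following sketch pairs an InferencePool with an InferenceModel. Field names follow the Inference Extension's v1alpha2 schema at the time of writing and may differ in the version you install; all resource names are illustrative.

```yaml
# InferencePool: groups the vLLM serving Pods and names the endpoint picker.
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferencePool
metadata:
  name: vllm-llama3
spec:
  selector:
    app: vllm-llama3          # matches the vLLM serving Pods
  targetPortNumber: 8000
  extensionRef:
    name: vllm-llama3-epp     # the Endpoint Selection Extension deployment
---
# InferenceModel: maps a client-facing model name onto the pool.
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceModel
metadata:
  name: llama3-chat
spec:
  modelName: llama3-chat      # the model name clients send in requests
  criticality: Critical       # scheduling priority hint for the picker
  poolRef:
    name: vllm-llama3
```

With this in place, the ESE can weigh queue depth and GPU cache state per Pod when choosing where to send each request, rather than round-robining blindly.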
Quick Deployment & Usage
kgateway offers Helm Charts and other installation methods for easy deployment in Kubernetes clusters. By default, it acts as a general Gateway API controller; to enable the AI extensions, simply activate the relevant feature flags. Operators can then define Gateway/HTTPRoute resources and a TrafficPolicy with `ai` fields to configure AI gateway capabilities. For example, the TrafficPolicy snippet below configures regex checks and a custom rejection response for the “openai” HTTPRoute, implementing basic Prompt Guard functionality:
```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: TrafficPolicy
metadata:
  name: openai-prompt-guard
  namespace: kgateway-system
spec:
  targetRefs:
    - kind: HTTPRoute
      name: openai
      group: gateway.networking.k8s.io
  ai:
    promptGuard:
      request:
        customResponse:
          message: "Request rejected due to inappropriate content"
        regex:
          action: REJECT
          matches:
            - pattern: "credit card"
```
After deployment, developers can use `kubectl` to submit Gateway and Route definitions, routing traffic to LLM or InferencePool backends. In inference service orchestration scenarios, kgateway serves as the AI traffic gateway at the cluster entry point, providing unified routing, control, and security for model invocation.
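As a minimal sketch, an HTTPRoute attaching to a kgateway-managed Gateway and sending OpenAI-style traffic to an InferencePool backend might look like this (the Gateway and pool names are illustrative):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llm-route
spec:
  parentRefs:
    - name: ai-gateway        # the kgateway-managed Gateway
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /v1        # OpenAI-compatible API prefix
      backendRefs:
        - group: inference.networking.x-k8s.io
          kind: InferencePool # route to the pool, not a plain Service
          name: vllm-llama3
```

The only AI-specific twist over a normal HTTPRoute is the `backendRefs` entry pointing at an InferencePool instead of a Service, which hands endpoint selection to the inference-aware picker.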
kagent: Kubernetes-Native Agentic AI Framework
kagent is the first open source autonomous Agent framework for Kubernetes, designed to help platform and DevOps engineers build and run AI agents for automated operations and intelligent decision-making in cloud-native environments. In March 2025, Solo.io open sourced kagent and announced its donation to the CNCF at KubeCon Europe to foster community development. kagent is dubbed “agentic AI for K8s,” bringing advanced reasoning and recursive planning capabilities to AI systems, enabling them to autonomously complete complex multi-step tasks. kagent pioneers this concept in the cloud-native space, providing a foundational platform for running AI Agents on Kubernetes.
kagent Architecture and Principles
kagent architecture consists of three key layers:
- Tools: Functional modules callable by Agents, following the open MCP (Model Context Protocol) standard. Tools are “capability units” such as viewing Pod logs, querying Prometheus metrics, or generating K8s resource manifests. kagent includes a suite of built-in Kubernetes tools and supports custom extensions. Each tool is essentially an “MCP server,” with Agents sending commands and receiving results via standard protocol.
- Agents: Autonomous entities capable of planning and sequential actions. Each Agent can use one or more tools to achieve goals, and can form teams of sub-Agents: a “Planner Agent” decomposes tasks and delegates to specialized Agents. Agents use large language models for reasoning and decision-making, parsing natural language instructions, interacting with tools, and feeding back results. Agents act as intelligent process automata, executing tasks until completion.
- Framework & Controller: kagent provides a simple declarative API (CRD) and controller to manage Agent lifecycles and execution. Developers define Agent configurations (tool sets, LLM provider, initial prompts) in YAML, which the kagent controller deploys and runs. The execution engine, built on Microsoft’s AutoGen framework, handles LLM integration, tool scheduling, and state management. CLI and Web UI are available for interactive sessions and monitoring.
The diagram below shows kagent’s architecture and execution flow in Kubernetes:
Key Features and Capabilities
kagent combines LangChain-style Agent execution with Kubernetes infrastructure, enabling AI Agents to perceive and operate cloud-native systems:
- Cloud-Native Operations Automation: With built-in tools, Agents can automate many routine ops tasks. For example, if a service is unavailable, an Agent can fetch Pod lists, retrieve error logs, and attempt Pod restarts or config rollbacks—all autonomously.
- Complex Fault Diagnosis & Optimization: For multi-component issues, Agents perform stepwise reasoning to narrow down problems, e.g., detecting CPU spikes, locating errors in logs, and adjusting resource quotas, completing detection to remediation automatically—enabling AgentOps for faster response and stability.
- Multi-Agent Collaboration: kagent supports team Agent mode, where a Planner Agent coordinates multiple Executor Agents. For example, one Agent handles traffic splitting, another manages DB migration, with the Planner orchestrating execution and feedback. Reliable communication is ensured via open standards like MCP and emerging A2A (Agent-to-Agent) protocols.
- Security & Observability: kagent provides audit tracking, metrics reporting, and plans deep OpenTelemetry integration for tracing LLM calls and tool executions. Integration with agentgateway adds authentication and access control for Agent invocations, mitigating risks and ensuring observability and accountability.
Usage and Examples
Building Agents with kagent is straightforward. The official `kagent` CLI and Helm deployment make installation easy. Before installing, configure LLM API credentials (e.g., an OpenAI API key as an environment variable). After installation, follow this workflow to create and run your first Agent:
- Define Agent CR: Edit a YAML file specifying the Agent (e.g., GPT-4 model, built-in tools, initial prompt). Applying it to the cluster triggers the kagent controller to create the Agent instance.
- Start Session: Use the `kagent` CLI to enter the Agent REPL, select the Agent, and submit tasks conversationally (e.g., “Query Pods in the kagent namespace”). The Agent calls GetResources and returns a Markdown-formatted answer.
- View Process: Ask the Agent “Which tools did you use?” to see the steps taken. The CLI/UI shows LLM prompts, answers, tool parameters, and results, which helps with debugging and prompt optimization.
- Continuous Run & Triggers: Agents can run continuously, listening for triggers (e.g., scheduled system checks or event subscriptions). On trigger, the Agent autonomously executes tasks, such as a “fault inspection Agent” scanning metrics hourly and diagnosing anomalies—enabling AgentOps. With GitOps, Agent definitions can be version-controlled in code repos.
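Step 1 above defines an Agent custom resource. A hedged sketch of what such a resource might look like follows; the field names track kagent’s v1alpha1 CRD at the time of writing and may change as the project evolves, and all names here are illustrative:

```yaml
apiVersion: kagent.dev/v1alpha1
kind: Agent
metadata:
  name: k8s-inspector
  namespace: kagent
spec:
  description: Answers questions about workloads in the cluster
  modelConfig: default-model-config   # references a ModelConfig holding the LLM credentials
  systemMessage: |
    You are a Kubernetes operations assistant.
    Use the provided tools to inspect cluster state before answering.
  tools:
    - type: McpServer
      mcpServer:
        toolServer: kagent-tool-server
        toolNames:                    # restrict the Agent to a safe tool subset
          - k8s_get_resources
          - k8s_get_pod_logs
```

Restricting `toolNames` to read-only tools, as sketched here, is one way to apply the safeguards the note below recommends.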
Note: As Agent decisions rely on LLM outputs, production use should include thorough testing and safeguards (e.g., restrict callable tools, require confirmation for critical actions). kagent is exploring feedback mechanisms (e.g., trace replay, result evaluation) to improve controllability and reliability.
agentgateway: Native Gateway for AI Agents
agentgateway is a newly open sourced project from Solo.io, donated to the Linux Foundation, positioned as the first “AI-native” proxy communication gateway. Unlike a modified traditional API gateway, it is purpose-built for autonomous agent network communication. agentgateway provides out-of-the-box security, observability, and governance for Agent-to-Agent (A2A) and Agent-to-Tool interactions, supporting emerging standards like A2A and MCP protocols. If AI Agents are like microservices, agentgateway acts as their “service mesh”—but goes further, understanding and handling AI-specific protocols and patterns.
Origins & Design: Solo.io initially considered extending Envoy for A2A/MCP support, but found new protocols required a fundamentally different proxy architecture. Inspired by Istio Ambient Mesh’s lightweight ztunnel proxy, Solo.io built agentgateway from scratch for AI scenarios. It is compact, efficient, and secure, proven in Ambient Mesh production, and quickly adapts to new AI protocols. As Solo.io’s CEO said: “Traditional API gateways can’t keep up with the rapid evolution of Agent architectures. We built agentgateway for the AI era’s protocols, patterns, and scale—it will be the hub for next-gen intelligent systems.”
agentgateway Architecture & Features
agentgateway is a Layer 7 proxy with flexible deployment options: as a standalone gateway service or as a sidecar/DaemonSet alongside Agents and tools. Configuration currently uses static files, with Gateway API integration planned for declarative management. Core modules include:
- Protocol Support: Built-in parsing and forwarding for Agent2Agent (A2A) and Model Context Protocol (MCP). A2A (by Google, etc.) enables cross-framework Agent communication via gRPC, etc. MCP standardizes Agent calls to external tools/data sources. agentgateway acts as a unified message exchange center for Agents and MCP tool services.
- Security Governance: Implements zero trust security, authenticating and auditing all Agent<->Tool calls. Agents can be assigned credentials for cross-service identity and RBAC. Sensitive tools (e.g., DB ops) can have fine-grained authorization, with all interactions logged for audit. This centralized policy layer enables safe, large-scale AI Agent adoption.
- Observability: Deep OpenTelemetry integration traces every Agent conversation, tool/model invocation, forming “AI call chain” logs for debugging and trust. Performance metrics (latency, success rate, tool call frequency) are aggregated for ops monitoring.
- Tool Registration & Federation: agentgateway introduces Tool Federation, exposing multiple MCP tool services as a single unified MCP entry point (similar to rube). Agents connect to one agentgateway address to access all tools, with a centralized registry for discovery and versioning. Existing REST APIs can be auto-wrapped as MCP interfaces, solving the “tool sprawl” problem and simplifying large-scale integration.
The diagram below shows agentgateway’s role in multi-Agent and multi-tool environments:
Enterprise Use Cases
agentgateway addresses enterprise challenges in deploying Agentic AI: lack of visibility, control, and security. Typical scenarios include:
- Agent Communication Bus: In complex AI workflows, different Agents need to exchange intents or information. agentgateway enables standard protocol communication, applies security policies, and acts as the message hub and firewall.
- Cross-Team Tool Sharing: Different teams may develop their own Agents and tools. agentgateway unifies tool registration and access, enabling MCP tool federation and building a shared tool asset library with access control and monitoring.
- Sensitive Operation Audit & Control: For high-risk tool invocations (e.g., delete DB), agentgateway can enforce approval workflows and log all actions for traceability.
- Unified Visual Ops: agentgateway provides a developer portal to view all Agents and tools, their health, and debug interactions—lowering the barrier and cost for AI Agent ops.
Deployment & Integration
agentgateway is available as a standalone executable, installable via script. A YAML config defines ports and backend tool addresses. For Kubernetes, agentgateway integrates with kgateway (e.g., as a custom backend), enabling dynamic routing via the Gateway API. As a Linux Foundation project, agentgateway will continue to enhance Kubernetes-native deployment, possibly via an Operator. Designed for seamless collaboration with kagent, kmcp, etc., agentgateway forms the core hub of an AI proxy communication network, connecting all Agents, tools, and models.
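To illustrate the static-file configuration style, the sketch below exposes a single MCP tool server behind one listener. The schema shown is an assumption paraphrased from the project’s quickstart and may differ between agentgateway versions; consult the current docs before use.

```yaml
# Hedged sketch of an agentgateway static config: one bind on port 3000
# federating an MCP backend that runs a tool server as a local process.
binds:
  - port: 3000
    listeners:
      - routes:
          - backends:
              - mcp:
                  targets:
                    - name: everything
                      stdio:
                        cmd: npx
                        args: ["@modelcontextprotocol/server-everything"]
```

Agents then connect to the single gateway address instead of each tool server individually, which is the basis of the Tool Federation model described above.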
kmcp: MCP Service Development & Ops Toolkit
kmcp (“Kubernetes Model Context Protocol” toolkit) is Solo.io’s open source suite for lightweight MCP tool development and operations. As MCP tool servers proliferate in AI applications, moving from prototype to production is challenging. kmcp addresses this by providing end-to-end support from code scaffolding, image building, to cluster deployment and management.
kmcp Core Features
kmcp focuses on boosting MCP Server development and deployment efficiency:
- Project Scaffolding: `kmcp init` quickly generates MCP service projects with templates for Python FastAPI, Go-kit, etc. (TypeScript and Java are planned). Templates include project structure, dependencies, tests, a Dockerfile, and more, ensuring best practices and maintainability.
- Image & Release: `kmcp build` packages container images (multi-arch supported), and `kmcp deploy` deploys to Kubernetes. Both stdio (local process) and HTTP Streamable modes are supported. Parameters allow setting the namespace, port, and env vars, plus `--dry-run` for GitOps. `kmcp deploy` opens an “MCP Inspector” for live testing.
- Kubernetes-Native Integration: `kmcp install` deploys the kmcp controller and registers the MCPServer CRD. Each `kmcp deploy` creates MCPServer resources, with the controller managing lifecycle and configuration (Deployment, Service, etc.), enabling declarative deployment and monitoring.
- Security & Governance Integration: kmcp services natively integrate with agentgateway, supporting sidecar injection or registry-based registration for unified authentication, traffic control, and observability. Security policies (via agentgateway) can restrict Agent access and rate limits, ensuring smooth enterprise adoption.
The diagram below shows kmcp’s development-to-deployment workflow:
Example: To develop a weather query tool for Agents:
- Initialize Project: `kmcp init python my-weather-server --author "Your Name" --email "[email protected]"` generates a Python FastAPI project with a basic query interface and tests.
- Implement Logic: Add code to call a weather API and return results; MCP protocol handling is prebuilt.
- Build & Release: `kmcp build --tag my-weather:v1.0` creates the image; `kmcp deploy --image my-weather:v1.0 --namespace tools --port 3000` deploys to Kubernetes. Run `kmcp install` first to set up the controller and CRD.
- Test & Register: Use the MCP Inspector to test requests and responses. Register the tool with agentgateway or configure its address for Agents. Now any Agent can call the weather tool via MCP.
Notes
When developing MCP services with kmcp, follow stateless, fast-start principles for scalability. As MCP tools may be called concurrently by Agents, implement caching or rate limiting to avoid backend API quota exhaustion. kmcp provides parameters for replicas and resource requests. For security, use Kubernetes Secrets for API keys, injected as env vars during deployment. In practice, kmcp enables a “tool-as-a-service” ecosystem, allowing teams to collaboratively develop capability modules, governed via agentgateway for modular, composable AI functionality.
Comparison with Mainstream Solutions
How do Solo.io’s projects differ from traditional solutions like Apache APISIX, Kong Gateway, and native Kubernetes Gateway API implementations?
- AI-Native Support: Solo.io’s solutions are optimized for AI scenarios. kgateway has built-in Prompt Guard and LLM load balancing, while APISIX/Kong require custom plugins for such features. agentgateway supports A2A/MCP protocols and Agent call chain monitoring, which traditional gateways do not. Solo.io anticipates the shift from API calls to Agent collaboration, architecting for this trend.
- Open Standards Alignment: Solo.io embraces open standards—kgateway fully implements Gateway API and extends it for inference; kagent/agentgateway implement and promote Agent2Agent, MCP, etc. Traditional solutions are more proprietary, though Kong/APISIX are gradually supporting Gateway API.
- Data Plane Tech Stack: kgateway is built on Envoy Proxy, offering advanced L7 processing and extensibility. Kong/APISIX are traditionally Nginx/OpenResty-based, which may be less flexible for modern traffic patterns (gRPC, WebSocket, HTTP/2). Envoy is better suited for streaming outputs (e.g., LLM interactions).
- Performance & Scalability: kgateway’s control plane is among the fastest and most resource-efficient for Envoy. agentgateway inherits ztunnel’s lightweight, high-concurrency design. Traditional gateways may bottleneck under heavy Agent communication loads.
- Maturity & Ecosystem: APISIX/Kong have rich API management features and plugin ecosystems. kgateway is still maturing in API management breadth but leads in AI Gateway capabilities. APISIX recently announced AI Gateway support, but mainly for traffic forwarding; Kong has few AI-specific features. Solo.io’s integrated platform (kgateway, kagent, agentgateway) offers a closed-loop AI-native solution.
In summary, Solo.io’s kgateway, kagent, agentgateway, etc., provide new capabilities for AI application deployment not found in traditional solutions, ideal for teams looking to build forward-looking AI infrastructure. For classic API gateway needs, APISIX/Kong remain reliable. As the paradigm shifts from “calling APIs” to “using Agents,” and open standards like A2A/MCP gain traction, Solo.io’s emerging solutions may lead the next generation of cloud-native architecture.
Summary & Recommended Users
Solo.io’s four major projects collectively enhance the “Kubernetes + AI” infrastructure landscape: kgateway offers a unified AI traffic gateway, strengthening LLM invocation governance and optimization; kagent brings intelligent Agents to clusters for automated ops and complex task execution; agentgateway creates a new network plane for Agent interactions, solving security and governance challenges; kmcp fills the tool service development and ops gap, enabling modular AI capability delivery. Together, they form a comprehensive AI-native cloud-native tech stack.
This suite is recommended for:
- Platform Engineering / DevOps Teams: If you’re exploring ChatGPT or LLMs for ops, monitoring, or autonomous operations (AgentOps), kagent and agentgateway are ideal, enabling controlled deployment of AI Agents for higher automation without losing system control.
- AI Development & Architecture Teams: For building complex AI apps (multi-Agent workflows, data intelligence apps), Solo.io projects provide a complete platform—from LLM gateway (kgateway), to Agent and tool orchestration (kagent + agentgateway), to reusable AI tools (kmcp)—accelerating development and reducing boilerplate.
- Tech-Savvy Startups: If you’re developing AI-native products (conversational BI, multimodal assistants) on Kubernetes, these projects serve as open source accelerators, offering out-of-the-box features (PromptGuard, InferencePool, A2A channels) for rapid prototyping.
- Large Enterprise IT Departments: For organizations with strict security/compliance needs, agentgateway’s policy control, audit, and unified entry allow different departments’ Agents and tools to collaborate in a regulated environment. kgateway can also gradually replace or integrate with existing Ingress controllers for enhanced AI traffic management.
Note: Most projects are rapidly evolving (kagent, agentgateway were recently donated to the community); thorough testing and attention to official updates are advised before production use. As AI and cloud-native converge, demand for “AI-as-a-Service” infrastructure will grow. Solo.io’s open source efforts offer valuable ideas and tools. For teams aiming to be at the technology frontier, try these projects, share feedback, and help improve the ecosystem. With these open source tools, making Kubernetes smarter and more automated is no longer a distant goal. As CNCF experts note: “The future of software will be Agent-driven”—now is the time to help build that future.
References
- Bringing Agentic AI to Kubernetes: Contributing Kagent to CNCF - solo.io
- Celebrating 100 Days of Kagent - solo.io
- Introducing kmcp for Enterprise-Grade MCP Development - solo.io
- Enterprise Challenges With MCP Adoption - solo.io
- What Is MCP (Model Context Protocol)? - solo.io
- Kgateway – The Next-Gen Gateway for Kubernetes, AI, and Agents - cncf.io
- Kgateway | CNCF Project Page - cncf.io
- Dive into Basic Prompt Guardrails with kgateway - kgateway.dev
- Deep Dive into the Gateway API Inference Extension - kgateway.dev
- Smarter AI Inference Routing on Kubernetes with Gateway API Inference Extension - kgateway.dev
- Prompt guards (AI Gateway) - kgateway.dev
- Inference Extension (Docs) - kgateway.dev
- Agentgateway — Agent Connectivity Solved (Homepage) - agentgateway.dev
- Welcome (Docs Home) — agentgateway - agentgateway.dev
- Get started (Quickstart) — agentgateway - agentgateway.dev
- Solo.io Contributes agentgateway to Linux Foundation to Make AI Agents More Accessible, Capable, and Secure - agentgateway.dev
- Linux Foundation Welcomes Agentgateway Project to Accelerate AI Agent Adoption While Maintaining Security, Observability and Governance - linuxfoundation.org
- Linux Foundation Newsletter: July 2025 — Agent2Agent Protocol Launch - linuxfoundation.org