Day One in Amsterdam: Kubernetes Is Rethinking AI

KubeCon Europe 2026 Day One: How Kubernetes is adapting to the AI infrastructure wave and the evolution of the GPU resource layer.

Today marks my first day at KubeCon Europe 2026.

Figure 1: Jimmy on the first day of KubeCon EU 2026

One strong impression stands out:

The world is big, but this circle is really small.

Old Friends, New Cycle

At the Maintainer Summit, I met many familiar faces—

Colleagues from Ant Group, friends from Tetrate, and some people I’ve known for nearly a decade. Together, we’ve journeyed from the early days of Kubernetes, Service Mesh, and cloud native infrastructure to today.

In a sense, this generation has fully experienced:

  • The rise of Kubernetes
  • The standardization of Cloud Native
  • The microservices and service mesh boom
  • And now, the era of AI Infrastructure

This isn’t about “new people entering the field,” but rather—

The same group stepping into a new technology cycle.

What Is the Maintainer Summit Discussing?

If you ask:

What is the Kubernetes community most concerned about right now?

Today’s answer is very clear:

👉 How to run AI workloads better on Kubernetes

Figure 2: The Maintainer Summit’s main topic is AI Infra

Many topics at the Maintainer Summit revolved around:

  • Scheduling models for LLM / AI workloads
  • GPU / accelerator resource management
  • Integrating inference systems with Kubernetes
  • Redefining the roles of data plane vs. control plane
  • How observability tools like OTel monitor AI workloads

In other words:

Kubernetes hasn’t been replaced by AI; it’s actively “absorbing” AI.

Key Signal: GPUs Are Becoming the “Infrastructure Layer”

Today, I had in-depth discussions with the CNCF TOC, Red Hat, and the vLLM community.

The core question was:

How should GPUs be “platformized”?

Some consensus is already clear:

  • GPUs are no longer just devices
  • They are now a schedulable, partitionable, and shareable resource layer

Figure 3: TOC meeting discussing GPU resource management and LLM Serving integration

These Maintainer Summit conversations went deep on GPU resource management and LLM Serving integration in Kubernetes scenarios, and we explored a potential collaboration between vLLM and HAMi.

Behind this is a major paradigm shift:

From past to now:

  • GPU as a node resource → GPU as an infrastructure layer
  • Exclusive use → multi-tenant sharing
  • Static binding → dynamic scheduling
  • Managed within frameworks → unified management at the platform layer

This is exactly what we’ve been working on in HAMi.

HAMi: From “Project” to “Reference Pattern”

Another interesting change today:

HAMi is no longer just a “community project”—it’s becoming:

A reference implementation (reference pattern) for AI Infra

Figure 4: Li Mengxuan, CTO of Dynamia, sharing HAMi’s design and practice at KubeCon EU 2026 Maintainer Summit

This is reflected in several ways:

  • Invited to present at the Maintainer Summit
  • Participating in CNCF TOC discussions
  • Giving demos as part of incubation reviews
  • Exploring joint content with the vLLM community (even discussing a joint blog 👀)

Especially in conversations with Red Hat and vLLM, a clear trend emerged:

GPU resource management and LLM serving are becoming coupled

That is:

  • Upper layer: vLLM / inference frameworks
  • Lower layer: GPU scheduling / sharing

A new “interface layer” is gradually forming.
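What that interface layer might look like is still an open question. As a purely hypothetical sketch (these names and methods are mine, not an existing API), the contract could be as narrow as "acquire a GPU slice, release a GPU slice", letting an inference framework run without knowing which physical device backs it:

```python
from typing import Protocol

class GPUResourceLayer(Protocol):
    """Hypothetical contract between LLM serving (above) and
    GPU scheduling/sharing (below)."""
    def acquire(self, mem_mib: int, cores_pct: int) -> str:
        """Reserve a GPU slice; returns an opaque slice handle."""
        ...
    def release(self, handle: str) -> None:
        """Return the slice to the pool."""
        ...

class InProcessLayer:
    """Toy fulfilment of the contract: hands out numbered slices."""
    def __init__(self) -> None:
        self._next = 0
        self.active: set[str] = set()

    def acquire(self, mem_mib: int, cores_pct: int) -> str:
        handle = f"slice-{self._next}"
        self._next += 1
        self.active.add(handle)
        return handle

    def release(self, handle: str) -> None:
        self.active.discard(handle)

def serve_model(layer: GPUResourceLayer) -> str:
    # An inference framework asks for capacity through the contract,
    # without knowing which physical GPU (or vendor) backs the slice.
    return layer.acquire(mem_mib=16384, cores_pct=50)
```

The point of such a contract is that either side can evolve independently: serving frameworks above it, and scheduling/sharing implementations below it.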

This is a direction worth betting on.

Figure 5: At the TAG Workshop, HAMi was discussed as an Incubating demo

A Caution: The AI Infra Startup Boom Hasn’t Really Begun

At the same time, I have a somewhat “counterintuitive” observation:

We haven’t yet seen a large wave of AI Infra (K8s-focused) startups.

Most of the companies I saw today fall into a few familiar buckets:

  • Pivots from CI/CD, Service Mesh, or Gateway
  • Traditional cloud vendors extending into AI
  • Teams working on models, agents, or even lower-level tech

But startups truly focused on:

“Making AI workloads run better on Kubernetes”

are still few at this layer.

This could mean two things:

1) This Layer Isn’t Fully Formed Yet

Currently, most activity is at:

  • The model layer (LLM / foundation models)
  • The application layer (Agent / Copilot)

But not at:

  • The scheduling layer
  • The resource layer
  • The runtime layer

2) Or, the Barrier to Entry Is Very High

Because at its core, this is:

The intersection of Cloud Native × GPU × AI workload

It’s not just “wrapping AI,” but a fundamental re-architecture at the infrastructure level.

My Take

If we break down the AI technology stack:

  • Agent / Application
  • LLM Serving (vLLM, etc.)
  • AI Runtime / Scheduling
  • GPU Resource Layer
  • Hardware

Most innovation today is concentrated in:

  • The top two layers (Agent / LLM)

But the real long-term moat lies in:

  • The middle two layers (Runtime + Resource Layer)

And Kubernetes is very likely to remain:

The default platform for this middle layer

Summary

Today’s takeaway:

Kubernetes is not obsolete; it’s being redefined.

And our generation is shifting from:

“Cloud Native Builders”

to:

“AI Infrastructure Builders”

More to come tomorrow.

Jimmy Song

Focusing on research and open source practices in AI-Native Infrastructure and cloud native application architecture.
