Time flies—it’s already been a month since I joined Dynamia. In this article, I want to share my observations from this past month: why AI Native Infra is a direction worth investing in, and some considerations for those thinking about their own career or technical direction.
Introduction
After nearly five years of remote work, I officially joined Dynamia last month as VP of Open Source Ecosystem. This decision was not sudden, but a natural extension of my journey from cloud native to AI Native Infra.
But this article is not just about my personal choice. I want to answer a more universal question: In the wave of AI infrastructure startups, why is compute governance a direction worth investing in?
For the past decade, I have worked continuously in the infrastructure space: from Kubernetes to Service Mesh, and now to AI Infra. I am increasingly convinced that the core challenge in the AI era is not “can the model run,” but “can compute resources be operated efficiently, reliably, and in a controlled manner.” This conviction has only grown stronger through my observations and reflections during this first month at Dynamia.
This article answers three questions: What is AI Native Infra? Why is GPU virtualization a necessity? Why did I choose Dynamia and HAMi?
What Is AI Native Infra
The core of AI Native Infrastructure is not about adding another platform layer, but about redefining the governance target: expanding from “services and containers” to “model behaviors and compute assets.”
I summarize it as three key shifts:
- Models as execution entities: Governance now includes not just processes, but also model behaviors.
- Compute as a scarce asset: GPU, memory, and bandwidth must be scheduled and metered precisely.
- Uncertainty as the default: Systems must remain observable and recoverable amid fluctuations.
In essence, AI Native Infra is about upgrading compute governance from “resource allocation” to “sustainable business capability.”
Why GPU Virtualization Is Essential
Many teams focus on model inference optimization, but in production, enterprises first encounter the problem of “underutilized GPUs.” This is where GPU virtualization delivers value.
- Structural idleness: Small tasks monopolize large GPUs, leaving them idle for long periods.
- Pseudo-isolation risks: Native sharing lacks hard boundaries, so an out-of-memory (OOM) error in one task can trigger cascading failures for its neighbors.
- Scheduling failures: Some users queue for GPUs while others occupy but do not use them, leading to both shortages and idleness.
- Fragmentation waste: Total GPU capacity may be sufficient, yet no whole card is free, making efficient packing impossible.
- Vendor lock-in anxiety: Proprietary, tightly coupled solutions make migration costs uncontrollable.
In short: GPUs must not only be allocatable, but also splittable, isolatable, schedulable, and governable.
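To make the “structural idleness” and “splittable” points concrete, here is a minimal sketch in plain Python with made-up numbers (not HAMi code): six small tasks that each monopolize a whole card under coarse allocation, versus the same tasks packed onto shared cards when the GPU can be split.

```python
# Illustrative numbers only: why "splittable" matters.
# Six small inference tasks, each needing only a few GB of GPU memory,
# land on a pool of 24 GB cards.

tasks_gb = [3, 3, 4, 2, 3, 5]   # real memory needs of the small tasks
card_gb = 24

# Policy A: whole-card allocation -- every task monopolizes one GPU.
cards_whole = len(tasks_gb)

# Policy B: fractional allocation -- first-fit pack tasks onto shared cards.
def first_fit(tasks, capacity):
    bins = []                    # remaining capacity of each opened card
    for t in sorted(tasks, reverse=True):
        for i, free in enumerate(bins):
            if free >= t:
                bins[i] -= t
                break
        else:
            bins.append(capacity - t)
    return len(bins)

cards_fractional = first_fit(tasks_gb, card_gb)

print(f"whole-card policy:  {cards_whole} GPUs")      # 6 GPUs, mostly idle
print(f"fractional policy:  {cards_fractional} GPU")  # 1 GPU (20 of 24 GB used)
```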
The Relationship Between HAMi and Dynamia
This is the most frequently asked question. Here is the shortest answer:
- HAMi: A CNCF-hosted open source project and community focused on GPU virtualization and heterogeneous compute scheduling.
- Dynamia: The founding and leading company behind HAMi, providing enterprise-grade products and services based on HAMi.
Open source projects are not the same as company products, but the two evolve together. HAMi drives industry adoption and technical trust, while Dynamia brings these capabilities into enterprise production environments at scale. This “dual engine” approach is what makes Dynamia unique.
What HAMi Provides
HAMi (Heterogeneous AI Computing Virtualization Middleware) delivers three key capabilities on Kubernetes:
- Virtualization and partitioning: Split physical GPUs into logical resources on demand to improve utilization.
- Scheduling and topology awareness: Place workloads optimally based on topology to reduce communication bottlenecks.
- Isolation and observability: Support quotas, policies, and monitoring to reduce production risks.
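To show what “virtualization and partitioning” looks like from the workload side, here is a minimal sketch that submits a Pod requesting a slice of a GPU through the official Kubernetes Python client. The extended resource names (`nvidia.com/gpumem` in MB, `nvidia.com/gpucores` as a percentage) follow HAMi’s documented conventions, but verify the exact names, units, and values against your deployed HAMi version; the Pod name and container image here are placeholders.

```python
# Minimal sketch: request a fraction of a GPU with HAMi-style extended resources.
# Resource names and units should be checked against the HAMi version in your cluster.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running in-cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="fractional-gpu-demo"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="inference",
                image="nvcr.io/nvidia/pytorch:24.05-py3",  # placeholder image
                command=["python", "-c", "import torch; print(torch.cuda.is_available())"],
                resources=client.V1ResourceRequirements(
                    limits={
                        "nvidia.com/gpu": "1",        # one logical GPU slice
                        "nvidia.com/gpumem": "8000",  # ~8 GB of device memory
                        "nvidia.com/gpucores": "30",  # ~30% of compute cores
                    }
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```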
Currently, HAMi has attracted over 360 contributors from 16 countries, with more than 200 enterprise end users, and its international influence continues to grow.
Market Trends: The AI Infrastructure Startup Wave
AI infrastructure is experiencing a new wave of startups. The vLLM team’s company raised $150 million, SGLang’s commercial spin-off RadixArk is valued at $4 billion, and Databricks acquired MosaicML for $1.3 billion—all pointing to a consensus: Whoever helps enterprises run large models more efficiently and cost-effectively will hold the keys to next-generation AI infrastructure.
Against this backdrop, the positioning of Dynamia and HAMi is even clearer. Many teams focus on “model performance acceleration” and “inference optimization” (like vLLM, SGLang), while we focus on “resource scheduling and virtualization”—enabling better orchestration of existing accelerated hardware resources.
The two are complementary: the former makes individual models run faster and cheaper, while the latter ensures that compute allocation at the cluster level is efficient, fair, and controllable. This is similar to extending Kubernetes’ CPU/memory scheduling philosophy to GPU and heterogeneous compute management in the AI era.
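As a toy illustration of extending that scheduling philosophy, the sketch below scores candidate nodes with a bin-packing heuristic that treats GPU memory and GPU cores as first-class dimensions alongside CPU and memory. It is my own simplified scoring function, not HAMi’s actual scheduler logic.

```python
# Toy bin-packing score: add GPU memory and GPU core dimensions next to CPU/memory.
# A higher score means the pod packs this node more tightly, which keeps
# emptier nodes free for large jobs. Illustrative only, not HAMi's implementation.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    cpu_free: float        # cores
    mem_free: float        # GiB
    gpu_mem_free: float    # GiB of allocatable GPU memory
    gpu_cores_free: float  # percent of GPU compute still unallocated

@dataclass
class PodRequest:
    cpu: float
    mem: float
    gpu_mem: float
    gpu_cores: float

def binpack_score(node: Node, pod: PodRequest) -> float:
    """Average fraction of each dimension's free capacity the pod would consume."""
    dims = [
        (pod.cpu, node.cpu_free),
        (pod.mem, node.mem_free),
        (pod.gpu_mem, node.gpu_mem_free),
        (pod.gpu_cores, node.gpu_cores_free),
    ]
    if any(req > free for req, free in dims):
        return -1.0            # does not fit on this node
    return sum(req / free for req, free in dims if free > 0) / len(dims)

nodes = [
    Node("gpu-node-a", cpu_free=8,  mem_free=32,  gpu_mem_free=10, gpu_cores_free=40),
    Node("gpu-node-b", cpu_free=32, mem_free=128, gpu_mem_free=80, gpu_cores_free=100),
]
pod = PodRequest(cpu=4, mem=16, gpu_mem=8, gpu_cores=30)

best = max(nodes, key=lambda n: binpack_score(n, pod))
print(best.name)  # picks the already-busy node, leaving the big node whole
```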
Why AI Native Infra Is Worth the Investment
My observations this month have convinced me that compute governance is the most undervalued yet most promising area in AI infrastructure. If you are considering a career or technical investment, here is my assessment:
First, this is a real and urgent pain point
Model training and inference optimization attract a lot of attention, but in production, enterprises first encounter the problem of “underutilized GPUs”—structural idleness, scheduling failures, fragmentation waste, and vendor lock-in anxiety. Without solving these problems, even the fastest models cannot scale in production. GPU virtualization and heterogeneous compute scheduling are the “infrastructure below infrastructure” for enterprise AI transformation.
Second, this is a clear long-term track
Frameworks like vLLM and SGLang emerge constantly, making individual models run faster. But who ensures that compute allocation at the cluster level is efficient, fair, and controllable? This is similar to extending Kubernetes’ success in CPU/memory scheduling to GPU and heterogeneous compute management in the AI era. This is not something that can be finished in a year or two, but a direction for continuous construction over the next five to ten years.
Third, this is an open and verifiable path
Dynamia chose to build on HAMi as an open source foundation, first solving general capabilities, then supporting enterprise adoption. This means the technical direction is transparent and verifiable in the community. You can form your own judgment by participating in open source, observing adoption, and evaluating the ecosystem—rather than relying on the black-box promises of proprietary solutions.
Fourth, this is a window of opportunity that is opening now
AI infrastructure is being redefined, and investing in its construction today will continue to yield value in the coming years. The funding signals mentioned earlier (the vLLM team’s $150 million raise, RadixArk’s $4 billion valuation, Databricks’ $1.3 billion acquisition of MosaicML) all validate the same trend: Whoever helps enterprises run large models more efficiently will hold the keys to next-generation AI infrastructure.
I hope to bring my experience in cloud native and open source communities to the next stage of HAMi and Dynamia: turning GPU resources from a “cost center” into an “operational asset.” This is not just my career choice, but my judgment and investment in the direction of next-generation infrastructure.
You are welcome to follow me on GitHub (jimmysong) and join the HAMi community focused on GPU virtualization and heterogeneous compute scheduling.
Summary
From cloud native to AI Native Infra, my observations this month have only strengthened my conviction: The true upper limit of AI applications is determined by the infrastructure’s ability to govern compute resources.
HAMi addresses the fundamental issues of GPU virtualization and heterogeneous compute scheduling, while Dynamia is driving these capabilities into large-scale production. If you are also looking for a technical direction worth long-term investment, AI Native Infra—especially compute governance and scheduling—is a track with real pain points, a clear path, an open ecosystem, and an opening window of opportunity.
Joining Dynamia is not just a career choice, but a commitment to building the next generation of infrastructure. I hope the observations and reflections in this article can provide some reference for you as you evaluate technical directions and career opportunities.
If you are also interested in HAMi, GPU virtualization, AI Native Infra, or Dynamia, feel free to reach out.
