AI-Native Infrastructure
Driving the implementation and evolution of AI-native infrastructure designed for uncertainty.
What Is AI-Native Infrastructure?
Infrastructure redesigned for AI systems — not retrofitted from cloud-native stacks.
- NON-DETERMINISM AI workloads are non-deterministic by nature
- AGENT-FIRST Agents, not services, are the primary execution unit
- FIRST-CLASS RESOURCES GPU, context, and tokens become first-class resources
- GOVERNANCE > DEPLOY Scheduling and governance matter more than deployment
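The "first-class resources" claim can be made concrete with Kubernetes Dynamic Resource Allocation (DRA), one of the mechanisms discussed in the notes on open GPU scheduling. A minimal sketch, assuming a DRA driver is installed; the device class `gpu.example.com` and the image name are placeholders, not real identifiers:

```yaml
# ResourceClaim: requests one GPU as a typed, structured API object
# rather than an opaque extended resource like nvidia.com/gpu.
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaim
metadata:
  name: single-gpu
spec:
  devices:
    requests:
    - name: gpu
      deviceClassName: gpu.example.com   # placeholder: provided by a DRA driver
---
# Pod: consumes the claim by name; the scheduler plans node placement
# and device allocation together.
apiVersion: v1
kind: Pod
metadata:
  name: inference
spec:
  resourceClaims:
  - name: gpu
    resourceClaimName: single-gpu
  containers:
  - name: app
    image: registry.example.com/inference:latest   # placeholder image
    resources:
      claims:
      - name: gpu
```

Because the claim is a first-class API object, quotas, admission policy, and scheduling constraints can act on it directly, which is what "governance matters more than deployment" looks like in practice.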
Research Evidence: Timeline & Method
From production pain points to structured expression, writing is how I validate and iterate on AI-native infrastructure ideas.
Writing Timeline
Cloud-native and Kubernetes phase: focused on container orchestration and platform fundamentals, including Kubernetes, Cloud Native Go, Cloud Native Java, Cloud Native Patterns, and Cloud Native Infrastructure.
Service mesh and microservices phase: deepened governance and traffic architecture through Istio, migration-focused microservice architecture, and Envoy-centered engineering practices.
AI-native infrastructure and AI phase: built a methodology spanning AI engineering to infrastructure through the RAG handbook, agentic design patterns, the AI handbook, GPU scheduling/virtualization, AI Native Infrastructure, and AI Infra Dao.
Research Method Cards
Problem-First
Start from real production pain points before introducing concepts and abstractions.
Systematic Decomposition
Break complex topics into resource model, runtime, platform engineering, and governance.
Verifiable Claims
Anchor conclusions in concrete projects, measurable signals, and reproducible cases.
Long-Horizon Iteration
Use posts, books, and talks to cross-validate ideas and continuously refine the boundary of practice.
AI OSS Landscape
Turn methodology into execution: a curated and continuously updated directory of open-source AI projects and tools.
Explore AI Open Source Resources
Browse a structured AI resource list for agents, AI coding tools, model infrastructure, and engineering workflows.
View AI Resource List
Latest Practical Notes
Recent engineering updates and practical notes that continue the research threads above.
When GPUs Move Toward Open Scheduling: Structural Shifts in AI Native Infrastructure
A CTO/VP view on open GPU scheduling: CDI, Kubernetes DRA, virtualization data planes, ecosystem governance, and lock-in risk.
AI Learning Resources: 44 Curated Collections from Our Cleanup
A curated collection of AI learning resources we removed from the AI Resources list: awesome lists, courses, tutorials, and cookbooks. These educational materials deserve their own spotlight.
Core Technology Domains
Research scope at a glance: three tracks that define the practical boundary of AI-native infrastructure.
AI Native Infrastructure
I focus on GPU virtualization, inference, agent runtimes, and engineering abstractions, exploring how to deliver these capabilities to production environments stably and efficiently.
Cloud Native
I study the boundaries and evolution of Kubernetes under AI workloads, including resource scheduling, elastic scaling, and multi-tenant governance.
Open Source
I participate in and promote the AI-native infrastructure open source ecosystem from an engineer's perspective, emphasizing verifiable, evolvable designs and collaboration models.
About Jimmy Song
Jimmy focuses on AI-Native Infrastructure and computing governance, with long-term research on GPU virtualization, heterogeneous scheduling, and system-level architecture for AI workloads. He is Open Source Ecosystem VP at Dynamia.ai, a CNCF Ambassador, and founder of the Cloud Native Community (China), and continues to drive the shift from cloud-native to AI-native engineering.
