In the AI era, open source is no longer just “visible source code”—it’s about “loadable models and tunable intelligence.” US companies build moats with closed models, while Chinese vendors open source to build ecosystems. The meaning and practice of open source have fundamentally changed.

## Introduction
A decade ago, the cloud native wave saw US companies like Google, Red Hat, and Docker open source massive infrastructure software—Kubernetes, Docker, and Istio became the common language for developers worldwide.
But in the era of large language models, the situation has reversed: US tech giants rarely open source their core models, while Chinese vendors (such as Zhipu, Alibaba, MiniMax, 01.AI, Moonshot, etc.) frequently release open source models. Why this shift? What are the fundamental differences between “AI open source” and “infrastructure open source”?
## How Open Source Logic Has Changed: Cloud Native vs. AI Era
The table below compares the core logic, monetization, and resource dependencies of open source in the cloud native and AI eras.
| Era | Representative Technologies | Core Open Source Logic | Monetization Model | Resource Dependency |
|---|---|---|---|---|
| Cloud Native Era (2010s) | Istio, Kubernetes, Docker | Building standards, expanding ecosystem | Managed services (GKE, EKS) | CPU-level compute, community-driven |
| AI Large Model Era (2020s) | Ollama, GPT, Qwen | Model as asset, data control | API services or closed SaaS | GPU-level compute, centralized |
Cloud native open source emphasizes “building standards together,” while AI large model open source means “opening up core assets.” Their essence and motivations are fundamentally different.
## Why US Companies No Longer Truly Open Source
US tech companies have chosen closed source in the AI era for several reasons:
- Business logic has shifted to moat-building: Training costs are high, model weights are the core barrier, and open sourcing means giving up competitiveness.
- Compute and data are not reproducible: The community cannot replicate GPT-4-level models.
- Security and compliance constraints: Model weights may involve user data and face strict regulation.
- “Open” is redefined as “API accessible”: Platforms are open in terms of interfaces, not code or weights.
## Why Chinese Companies Are More Willing to Open Source
Chinese vendors actively open source in AI for several reasons:
- Open source is used to build ecosystems and brand awareness quickly.
- Dual-track model of “open source + commercial license” balances ecosystem growth and revenue.
- Data policy environment is more flexible, with policies encouraging proprietary models.
- National strategy drives “independent controllability” and “open source ecosystem” as technology priorities.
## The Shift of Open Source Platforms: From GitHub to Hugging Face
The platform for open source has also changed. The table below shows the differences between GitHub and Hugging Face in open source forms.
| Platform | Era | Core Asset | Open Source Form |
|---|---|---|---|
| GitHub | Software / Cloud Native | Source code (.go /.py /.js) | Compilable, runnable |
| Hugging Face | AI Models | Model weights + Tokenizer + Inference scripts | Loadable, tunable |
GitHub mainly open sources “program logic,” while Hugging Face open sources “model intelligence.” Their core assets are completely different.
## Core Elements of AI Open Source
Open source in the AI era is not just about code—it includes weights, inference code, and fine-tuning capability. Below are the three key elements.
### Open Weights
All knowledge learned during model training is stored in the weight parameters. Having the weights means owning the “intelligence body” of the model. Closed models (like GPT-4) only provide APIs, not weights.
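Concretely, "having the weights" is as simple as downloading them. A minimal sketch using the huggingface_hub library (note that the snapshot for a 4B-parameter model is several gigabytes):

```python
from huggingface_hub import snapshot_download

# Download all model files (weights, tokenizer, configs) to the local cache
local_dir = snapshot_download("Qwen/Qwen3-4B-Instruct-2507")
print(local_dir)  # contains config.json, *.safetensors, tokenizer files, ...
```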
### Open Inference Code
Inference code defines how to load weights, tokenize, perform concurrent computation, and optimize memory. The code below demonstrates how to load the Qwen3 model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_name = "Qwen/Qwen3-4B-Instruct-2507"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",  # automatically select the precision (e.g. bfloat16)
    device_map="auto",   # automatically place the model on GPU / CPU
)

# Inference
prompt = "Hello, please briefly explain the principle of fine-tuning large models."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
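For instruction-tuned checkpoints such as this one, wrapping the prompt in the model's chat template usually produces better answers. A minimal sketch, reusing the tokenizer and model loaded above:

```python
# Wrap the prompt in the model's chat template before generating
messages = [{"role": "user", "content": prompt}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant-turn marker
    return_tensors="pt",
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=200)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```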
### Fine-tuning
Fine-tuning means further training an open source model so that it adapts to specific data and scenarios. Common methods such as LoRA / QLoRA are low-cost and can turn a general model into an enterprise-specific assistant, as sketched below.
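A minimal LoRA sketch using the PEFT library; the target module names q_proj and v_proj are an assumption based on Qwen-style attention layers:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    device_map="auto",
)
lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor for the update
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt (assumed names)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```

Because only the small adapter matrices are trained, LoRA fine-tuning fits on a single consumer GPU for models of this size.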
## Why Enterprises Prefer Self-Deployment Over API
In practice, enterprises often prefer to self-deploy open source models. The table below summarizes the main reasons and explanations.
| Reason | Explanation |
|---|---|
| Data privacy | Sensitive data cannot be sent externally |
| Cost control | API is billed per call, expensive long-term |
| Customizability | Can integrate enterprise knowledge for RAG / Agent |
| Operability | Can run offline, unified monitoring, compliant deployment |
## Qwen3-4B-Instruct-2507 Model Structure and Usage
Taking Qwen3-4B-Instruct-2507 as an example, here is the directory structure and usage on Hugging Face.
### Directory Structure Explanation

After downloading, the model directory looks roughly like this (the weight files are sharded, so exact shard names vary by model):
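```text
Qwen3-4B-Instruct-2507/
├── config.json                   # model architecture definition
├── model.safetensors.*           # sharded weight files
├── model.safetensors.index.json  # maps each tensor to its shard
├── tokenizer.json
├── tokenizer_config.json
├── vocab.json
├── merges.txt
├── generation_config.json
├── README.md
├── LICENSE
└── .gitattributes
```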
In this directory, the model.safetensors.* files contain the model weights, storing billions of learned parameters.
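To peek inside a weight shard without loading the whole model, the safetensors library can list tensor names and shapes. A sketch, assuming it runs inside the downloaded model directory:

```python
import glob
from safetensors import safe_open

# Open the first weight shard and list a few tensors without loading them
shard = sorted(glob.glob("model*.safetensors"))[0]
with safe_open(shard, framework="pt") as f:
    for name in list(f.keys())[:5]:
        print(name, f.get_slice(name).get_shape())
```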
Each category of files serves the following purposes:
| Category | Files | Purpose |
|---|---|---|
| Model Definition | config.json, model.safetensors.*, model.safetensors.index.json | Defines model structure and weights |
| Tokenizer | tokenizer.json, tokenizer_config.json, vocab.json, merges.txt | Defines text input/output encoding |
| Inference Config | generation_config.json | Controls generation strategy (temperature, top_p, etc.) |
| Metadata | README.md, LICENSE, .gitattributes | Model introduction, license, Git attributes |
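For example, the generation defaults shipped in generation_config.json can be read directly through transformers:

```python
from transformers import GenerationConfig

# Read the generation defaults shipped with the model
gen_config = GenerationConfig.from_pretrained("Qwen/Qwen3-4B-Instruct-2507")
print(gen_config.temperature, gen_config.top_p)
```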
### Loading and Inference Code Example
The following code shows how to load and run the Qwen3-4B-Instruct-2507 model:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    device_map="auto",
)

# Build input and run inference
prompt = "Hello, explain the meaning of cloud native."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
If GPU memory is insufficient, you can use quantized loading:
```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 8-bit quantized loading (requires the bitsandbytes package)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)
```
## Developer Application Scenarios for Open Source LLMs
Open source large models bring developers a wealth of application scenarios. The table below summarizes common directions, uses, and tools.
| Direction | Use Case | Tools |
|---|---|---|
| Chat / Assistant | Local ChatGPT | LM Studio, TextGen WebUI |
| Knowledge Base RAG | Private data Q&A | LangChain, LlamaIndex |
| Agent | Task execution, tool calling | LangGraph, Autogen |
| Fine-tuning / Adaptation | Custom enterprise knowledge | PEFT, LoRA |
| Model Service | Deploy as API service | vLLM, TGI, Ollama |
| Research | Model compression, quantization | BitsAndBytes, FlashAttention |
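As one example from the "Model Service" row, vLLM's offline Python API can serve the same model with batched, high-throughput inference. A minimal sketch, assuming vLLM is installed and a GPU is available:

```python
from vllm import LLM, SamplingParams

# Load the model into vLLM's inference engine
llm = LLM(model="Qwen/Qwen3-4B-Instruct-2507")
sampling = SamplingParams(temperature=0.7, top_p=0.8, max_tokens=200)

outputs = llm.generate(["Hello, explain the meaning of cloud native."], sampling)
print(outputs[0].outputs[0].text)
```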
## Open Source Model Lifecycle
The full lifecycle of an open source model from download to production deployment is as follows:
1. Download model weights
2. Load inference code
3. Run local inference or deploy as a service
4. Fine-tune with proprietary data
5. Integrate with enterprise RAG / Agent systems
6. Launch in production
## How to Judge the License of Open Source Large Models

Downloading an open source model means owning a loadable, trainable, and potentially commercializable “intelligent brain.” But just as with traditional open source projects, whether you can use it for profit depends on its license.
How to check:
- The License tag on the Hugging Face model homepage (top right)
- The LICENSE or README.md file in the repository root
A brief decision flow:
- Permissive license (Apache-2.0, MIT): free to use commercially, modify, and redistribute
- Custom model license (e.g. community licenses like Llama's): read the specific terms for restrictions such as user-scale limits or prohibited uses
- Research / non-commercial license: evaluation and research only; commercial use requires separate authorization
## Summary
In the AI era, open source has shifted from “visible source code” to “loadable models and tunable intelligence.” US vendors maintain business moats with closed models, while Chinese vendors use open source to seize ecosystem leadership. The true value of open source is empowering developers—enabling everyone to own their own “general-purpose brain” and build intelligent infrastructure.