Inference Traffic Control with Gateway API Inference Extension
Exploring how Gateway API Inference Extension brings model-aware inference traffic control through InferencePool, InferenceObjective, and metrics-driven routing.