InferencePool represents a set of Inference-focused Pods and an extension that will be used to route to them. Within the broader Gateway API resource model, this resource is considered a "backend". In practice, that means that you'd replace a Kubernetes Service with an InferencePool. This resource has some similarities to Service (a way to select Pods and specify a port), but has some unique capabilities. With InferenceModel, you can configure a routing extension as well as inference-specific routing optimizations. For more information on this resource, refer to our InferencePool documentation or go directly to the InferencePool spec.
I think I'm seeing some inconsistency in the docs with respect to names and spec definitions.
From this page: https://gateway-api-inference-extension.sigs.k8s.io/concepts/api-overview/#inferencepool
Specifically "With InferenceModel, you can configure a routing extension...."
This section is about InferencePool, so why is it referring to InferenceModel? This seems to be a mistake, and it should refer to InferencePool.
Additionally, that part of the doc talks about configuring a "routing extension". The spec in the docs doesn't show how to configure that (i.e., there should be an extensionRef field, but there isn't):
https://gateway-api-inference-extension.sigs.k8s.io/reference/spec/#inferencepoolspec
Let me know if these are oversights that need to be corrected, or if I am missing something.