Document model server compatibility and config options #537
base: main
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: liu-cong. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files.
Approvers can indicate their approval by writing `/approve` in a comment.
Thanks
@@ -61,6 +61,7 @@ nav:
   - Getting started: guides/index.md
   - Adapter Rollout: guides/adapter-rollout.md
   - Metrics: guides/metrics.md
+  - Supported Model Servers: guides/model-server.md
why under guides?
where else do you suggest?
I would put it under overview after the implementations section
site-src/guides/model-server.md (Outdated)

## Use Triton with TensorRT-LLM Backend

You need to specify the metric names when starting the EPP container. Add the following to the `args` of the [EPP deployment](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/296247b07feed430458b8e0e3f496055a88f5e89/config/manifests/inferencepool.yaml#L48).
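The snippet with the actual args is not shown in this excerpt. As a rough sketch only (the flag names and Triton metric names below are assumptions for illustration, not confirmed by this thread), the EPP container args might look something like:

```yaml
# Sketch of EPP deployment container args mapping the scheduler's
# expected metrics to Triton/TensorRT-LLM metric names.
# Flag names and metric label selectors are illustrative assumptions.
args:
  - -totalQueuedRequestsMetric
  - "nv_trt_llm_request_metrics{request_type=waiting}"
  - -kvCacheUsagePercentageMetric
  - "nv_trt_llm_kv_cache_block_metrics{kv_cache_block_type=fraction}"
```

The idea is that each flag tells the EPP which model-server metric to scrape for a given scheduling signal, since metric names differ between vLLM and Triton.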
We should make this a flag in the Helm chart: the flag would be the model-server name, and the chart would set those metric flags automatically.
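One hypothetical shape for that suggestion (the value name below is invented for illustration, not an existing chart option) would be a single Helm value naming the model server, from which the chart templates derive the per-server metric flags:

```yaml
# Hypothetical Helm values sketch: select the model server by name;
# the chart would template the matching EPP metric flags automatically.
inferenceExtension:
  modelServer: triton-tensorrt-llm   # e.g. vllm | triton-tensorrt-llm
```

This would keep users from having to know the per-server metric names at all.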
Co-authored-by: Abdullah Gharaibeh <[email protected]>
@liu-cong: The following test failed; say `/retest` to rerun all failed tests.
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/hold I will wait for the Triton metric PR to be merged.
Fixes #482
Part of #523