Commit 8650b30

removed hf token from cpu based example
Signed-off-by: Nir Rozenbaum <[email protected]>
1 parent 23bab8c commit 8650b30

2 files changed: 0 additions, 12 deletions

config/manifests/vllm/cpu-deployment.yaml

10 deletions

@@ -29,11 +29,6 @@ spec:
         env:
           - name: PORT
             value: "8000"
-          - name: HUGGING_FACE_HUB_TOKEN
-            valueFrom:
-              secretKeyRef:
-                name: hf-token
-                key: token
           - name: VLLM_ALLOW_RUNTIME_LORA_UPDATING
             value: "true"
         ports:
@@ -78,11 +73,6 @@ spec:
           - --duplicate-count
           - "4"
         env:
-          - name: HF_TOKEN
-            valueFrom:
-              secretKeyRef:
-                name: hf-token
-                key: token
           - name: HF_HOME
             value: /adapters
         volumeMounts:

site-src/guides/index.md

2 deletions

@@ -34,10 +34,8 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
 
 #### CPU-Based Model Server
 
-Create a Hugging Face secret to download the model [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct). Ensure that the token grants access to this model.
 Deploy a sample vLLM deployment with the proper protocol to work with the LLM Instance Gateway.
 ```bash
-kubectl create secret generic hf-token --from-literal=token=$HF_TOKEN # Your Hugging Face Token with access to Qwen
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/cpu-deployment.yaml
 ```
 