
Commit ffe1ed2

Add Llama 3 support to chat bot example (#3131)

Authored May 3, 2024

* mv example from llama2 to llama
* Update path in README.md
* Updated llama readme files
* Updated chat bot example to support llama3
* fix lint error
* More lint issues fixed

1 parent: 5c1682a

27 files changed: +186 −133 lines changed
 

‎README.md

+3 −3

@@ -79,7 +79,7 @@ Refer to [torchserve docker](docker/README.md) for details.
 * Microsoft [DeepSpeed](examples/large_models/deepspeed), [DeepSpeed-Mii](examples/large_models/deepspeed_mii)
 * Hugging Face [Accelerate](examples/large_models/Huggingface_accelerate), [Diffusers](examples/diffusers)
 * Running large models on AWS [Sagemaker](https://docs.aws.amazon.com/sagemaker/latest/dg/large-model-inference-tutorials-torchserve.html) and [Inferentia2](https://pytorch.org/blog/high-performance-llama/)
-* Running [Llama 2 Chatbot locally on Mac](examples/LLM/llama2)
+* Running [Meta Llama Chatbot locally on Mac](examples/LLM/llama)
 * Monitoring using Grafana and [Datadog](https://www.datadoghq.com/blog/ai-integrations/#model-serving-and-deployment-vertex-ai-amazon-sagemaker-torchserve)
 
 

@@ -90,8 +90,8 @@ Refer to [torchserve docker](docker/README.md) for details.
 
 
 ## 🏆 Highlighted Examples
-* [Serving Llama 2 with TorchServe](examples/LLM/llama2/README.md)
-* [Chatbot with Llama 2 on Mac 🦙💬](examples/LLM/llama2/chat_app)
+* [Serving Meta Llama with TorchServe](examples/LLM/llama/README.md)
+* [Chatbot with Meta Llama on Mac 🦙💬](examples/LLM/llama/chat_app)
 * [🤗 HuggingFace Transformers](examples/Huggingface_Transformers) with a [Better Transformer Integration/ Flash Attention & Xformer Memory Efficient ](examples/Huggingface_Transformers#Speed-up-inference-with-Better-Transformer)
 * [Stable Diffusion](examples/diffusers)
 * [Model parallel inference](examples/Huggingface_Transformers#model-parallelism)

‎examples/LLM/llama2/README.md ‎examples/LLM/llama/README.md

+9 −9

@@ -1,13 +1,13 @@
-# Llama 2: Next generation of Meta's Language Model
-![Llama 2](./images/llama.png)
+# Meta Llama: Next generation of Meta's Language Model
+![Llama](./images/llama.png)
 
-TorchServe supports serving Llama 2 in a number of ways. The examples covered in this document range from someone new to TorchServe learning how to serve Llama 2 with an app, to an advanced user of TorchServe using micro batching and streaming response with Llama 2
+TorchServe supports serving Meta Llama in a number of ways. The examples covered in this document range from someone new to TorchServe learning how to serve Meta Llama with an app, to an advanced user of TorchServe using micro batching and streaming response with Meta Llama.
 
-## 🦙💬 Llama 2 Chatbot
+## 🦙💬 Meta Llama Chatbot
 
-### [Example Link](https://github.com/pytorch/serve/tree/master/examples/LLM/llama2/chat_app)
+### [Example Link](https://github.com/pytorch/serve/tree/master/examples/LLM/llama/chat_app)
 
-This example shows how to deploy a llama2 chat app using TorchServe.
+This example shows how to deploy a Llama chat app using TorchServe.
 We use [streamlit](https://github.com/streamlit/streamlit) to create the app
 
 This example is using [llama-cpp-python](https://github.com/abetlen/llama-cpp-python).

@@ -16,11 +16,11 @@ You can run this example on your laptop to understand how to use TorchServe, how
 
 ![Chatbot Architecture](./chat_app/screenshots/architecture.png)
 
-## Llama 2 with HuggingFace
+## Meta Llama with HuggingFace
 
-### [Example Link](https://github.com/pytorch/serve/tree/master/examples/large_models/Huggingface_accelerate/llama2)
+### [Example Link](https://github.com/pytorch/serve/tree/master/examples/large_models/Huggingface_accelerate/llama)
 
-This example shows how to serve Llama 2 - 70b model with limited resource using [HuggingFace](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf). It shows the following optimizations
+This example shows how to serve the meta-llama/Meta-Llama-3-70B-Instruct model with limited resources using [HuggingFace](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct). It shows the following optimizations:
 1) HuggingFace `accelerate`. This option can be activated with `low_cpu_mem_usage=True`.
 2) Quantization from [`bitsandbytes`](https://github.com/TimDettmers/bitsandbytes) using `load_in_8bit=True`
 The model is first created on the Meta device (with empty weights) and the state dict is then loaded inside it (shard by shard in the case of a sharded checkpoint).

‎examples/LLM/llama2/chat_app/Readme.md ‎examples/LLM/llama/chat_app/Readme.md

+12 −12

@@ -1,7 +1,7 @@
 
-# TorchServe Llama 2 Chatapp
+# TorchServe Llama Chatapp
 
-This is an example showing how to deploy a llama2 chat app using TorchServe.
+This is an example showing how to deploy a Llama chat app using TorchServe.
 We use [streamlit](https://github.com/streamlit/streamlit) to create the app
 
 We are using [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) in this example
@@ -17,21 +17,21 @@ To get started with TorchServe, you need to run the following
 # 1: Set HF Token as Env variable
 export HUGGINGFACE_TOKEN=<Token> # get this from your HuggingFace account
 
-# 2: Build TorchServe Image for Serving llama2-7b model with 4-bit quantization
-./examples/llm/llama2/chat_app/docker/build_image.sh meta-llama/Llama-2-7b-chat-hf
+# 2: Build TorchServe Chat Bot Image for Serving
+./examples/LLM/llama/chat_app/docker/build_image.sh
 
 # 3: Launch the streamlit app for server & client
-docker run --rm -it --platform linux/amd64 -p 127.0.0.1:8080:8080 -p 127.0.0.1:8081:8081 -p 127.0.0.1:8082:8082 -p 127.0.0.1:8084:8084 -p 127.0.0.1:8085:8085 -v <model-store>:/home/model-server/model-store pytorch/torchserve:meta-llama---Llama-2-7b-chat-hf
+docker run --rm -it --platform linux/amd64 -p 127.0.0.1:8080:8080 -p 127.0.0.1:8081:8081 -p 127.0.0.1:8082:8082 -p 127.0.0.1:8084:8084 -p 127.0.0.1:8085:8085 -v <model-store>:/home/model-server/model-store -e MODEL_NAME=meta-llama/Meta-Llama-3-8B-Instruct pytorch/torchserve:chat_bot
 ```
 In step 3, `<model-store>` is a location where you want the model to be downloaded
 
 ### What to expect
 This launches two streamlit apps
 1. TorchServe Server app to start/stop TorchServe, load model, scale up/down workers, configure dynamic batch_size (currently llama-cpp-python doesn't support batch_size > 1)
-    - Since this app is targeted for Apple M1/M2 laptops, we load a 4-bit quantized version of llama2 using llama-cpp-python.
+    - Since this app is targeted for Apple M1/M2 laptops, we load a 4-bit quantized version of llama using llama-cpp-python.
 2. Client chat app where you can chat with the model. There is a slider to send concurrent requests to the model. The current app doesn't have a good mechanism to show multiple responses in parallel. You can notice streaming response for the first request followed by a complete response for the next request.
 
-Currently, this launches llama2-7b model with 4-bit quantization running on CPU.
+Currently, this launches Meta-Llama-3-8B-Instruct with 4-bit quantization running on CPU.
 
 To make use of M1/M2 GPU, you can follow the below guide to do a standalone TorchServe installation.
 
@@ -55,8 +55,8 @@ javac 17.0.8
 You can download it from [java](https://www.oracle.com/java/technologies/downloads/#jdk17-mac)
 2) Install conda with support for arm64
 
-3) Since we are running this example on Mac, we will use the 7B llama2 model.
-Download llama2-7b weights by following instructions [here](https://github.com/pytorch/serve/tree/master/examples/large_models/Huggingface_accelerate/llama2#step-1-download-model-permission)
+3) Since we are running this example on Mac, we will use the Meta-Llama-3-8B-Instruct model.
+Download Meta-Llama-3-8B-Instruct weights by following instructions [here](https://github.com/pytorch/serve/tree/master/examples/large_models/Huggingface_accelerate/llama#step-1-download-model-permission)
 
 4) Install streamlit with
 
@@ -80,9 +80,9 @@ pip install torchserve torch-model-archiver torch-workflow-archiver
 Run this script to create `llamacpp.tar.gz` to be loaded in TorchServe
 
 ```
-source package_llama.sh <path to llama2 snapshot folder>
+source package_llama.sh <path to llama snapshot folder>
 ```
-This creates the quantized weights in `$LLAMA2_WEIGHTS`
+This creates the quantized weights in `$LLAMA_Q4_MODEL`
 
 For subsequent runs, we don't need to regenerate these weights. We only need to package the handler and model-config.yaml in the tar file.
 
@@ -97,7 +97,7 @@ You might need to run the below command if the script output indicates it.
 sudo xcodebuild -license
 ```
 
-The script is setting an env variable `LLAMA2_Q4_MODEL` and using this in the handler. In an actual use-case, you would set the path to the weights in `model-config.yaml`
+The script is setting an env variable `LLAMA_Q4_MODEL` and using this in the handler. In an actual use-case, you would set the path to the weights in `model-config.yaml`
 
 ```
 handler:
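
Putting the new pieces together, the end-to-end flow from the updated Readme looks roughly like this (a sketch assuming you run from the repo root; `./model_store_1` is only an example model-store location, matching the path echoed by `build_image.sh`, and `<Token>` is your HuggingFace token):

```
# 1: Set HF Token as Env variable
export HUGGINGFACE_TOKEN=<Token>

# 2: Build the generic chat bot image (no model is baked in any more)
./examples/LLM/llama/chat_app/docker/build_image.sh

# 3: Pick the model at run time via the MODEL_NAME env variable
docker run --rm -it --platform linux/amd64 \
  -p 127.0.0.1:8080:8080 -p 127.0.0.1:8081:8081 -p 127.0.0.1:8082:8082 \
  -p 127.0.0.1:8084:8084 -p 127.0.0.1:8085:8085 \
  -v $(pwd)/model_store_1:/home/model-server/model-store \
  -e MODEL_NAME=meta-llama/Meta-Llama-3-8B-Instruct \
  pytorch/torchserve:chat_bot
```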

‎examples/LLM/llama2/chat_app/docker/Dockerfile ‎examples/LLM/llama/chat_app/docker/Dockerfile

+6 −3

@@ -3,20 +3,23 @@ ARG BASE_IMAGE=pytorch/torchserve:latest-gpu
 FROM $BASE_IMAGE as server
 ARG BASE_IMAGE
 ARG EXAMPLE_DIR
-ARG MODEL_NAME
 ARG HUGGINGFACE_TOKEN
 
 USER root
 
-ENV MODEL_NAME=$MODEL_NAME
-
 RUN --mount=type=cache,id=apt-dev,target=/var/cache/apt \
     apt-get update && \
     apt-get install libopenmpi-dev git -y
 
 COPY $EXAMPLE_DIR/requirements.txt /home/model-server/chat_bot/requirements.txt
 RUN pip install -r /home/model-server/chat_bot/requirements.txt && huggingface-cli login --token $HUGGINGFACE_TOKEN
 
+WORKDIR /home/model-server/chat_bot
+RUN git clone https://github.com/ggerganov/llama.cpp.git build && \
+    cd build && \
+    make && \
+    python -m pip install -r requirements.txt
+
 COPY $EXAMPLE_DIR /home/model-server/chat_bot
 COPY $EXAMPLE_DIR/dockerd-entrypoint.sh /usr/local/bin/dockerd-entrypoint.sh
 COPY $EXAMPLE_DIR/config.properties /home/model-server/config.properties
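
Because llama.cpp is now cloned and built while the image is built (rather than during model packaging), the `convert.py` and `quantize` tools ship inside the image under `/home/model-server/chat_bot/build`. A quick, hypothetical way to confirm this after building, overriding the entrypoint just for inspection:

```
docker run --rm --entrypoint ls pytorch/torchserve:chat_bot /home/model-server/chat_bot/build
```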
examples/LLM/llama/chat_app/docker/Download_model.py (new file)

+53

@@ -0,0 +1,53 @@
+import argparse
+import os
+
+from huggingface_hub import HfApi, snapshot_download
+
+
+def dir_path(path_str):
+    if os.path.isdir(path_str):
+        return path_str
+    elif input(f"{path_str} does not exist, create directory? [y/n]").lower() == "y":
+        os.makedirs(path_str)
+        return path_str
+    else:
+        raise NotADirectoryError(path_str)
+
+
+class HFModelNotFoundError(Exception):
+    def __init__(self, model_str):
+        super().__init__(f"HuggingFace model not found: '{model_str}'")
+
+
+def hf_model(model_str):
+    api = HfApi()
+    models = [m.modelId for m in api.list_models()]
+    if model_str in models:
+        return model_str
+    else:
+        raise HFModelNotFoundError(model_str)
+
+
+parser = argparse.ArgumentParser()
+parser.add_argument(
+    "--model_path",
+    "-o",
+    type=dir_path,
+    default="model",
+    help="Output directory for downloaded model files",
+)
+parser.add_argument(
+    "--model_name", "-m", type=hf_model, required=True, help="HuggingFace model name"
+)
+parser.add_argument("--revision", "-r", type=str, default="main", help="Revision")
+args = parser.parse_args()
+
+snapshot_path = snapshot_download(
+    repo_id=args.model_name,
+    revision=args.revision,
+    cache_dir=args.model_path,
+    use_auth_token=True,
+    ignore_patterns=["original/*", "pytorch_model*.bin"],
+)
+
+print(f"Files for '{args.model_name}' are downloaded to '{snapshot_path}'")
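
A hedged usage sketch for the download script when invoked by hand (inside the container, `dockerd-entrypoint.sh` calls it with the model-store path; the model name and output directory below are only examples):

```
huggingface-cli login   # snapshot_download(..., use_auth_token=True) needs a cached token
python Download_model.py --model_name meta-llama/Meta-Llama-3-8B-Instruct --model_path model
```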
examples/LLM/llama2/chat_app/docker/build_image.sh → examples/LLM/llama/chat_app/docker/build_image.sh

@@ -1,17 +1,8 @@
 #!/bin/bash
 
-# Check if there are enough arguments
-if [ "$#" -eq 0 ] || [ "$#" -gt 1 ]; then
-    echo "Usage: $0 <HF Model>"
-    exit 1
-fi
-
-MODEL_NAME=$(echo "$1" | sed 's/\//---/g')
-echo "Model: " $MODEL_NAME
-
 BASE_IMAGE="pytorch/torchserve:latest-cpu"
 
-DOCKER_TAG="pytorch/torchserve:${MODEL_NAME}"
+DOCKER_TAG="pytorch/torchserve:chat_bot"
 
 # Get relative path of example dir
 EXAMPLE_DIR=$(dirname "$(readlink -f "$0")")

@@ -20,9 +11,10 @@ ROOT_DIR=$(realpath "$ROOT_DIR")
 EXAMPLE_DIR=$(echo "$EXAMPLE_DIR" | sed "s|$ROOT_DIR|./|")
 
 # Build docker image for the application
-DOCKER_BUILDKIT=1 docker buildx build --platform=linux/amd64 --file ${EXAMPLE_DIR}/Dockerfile --build-arg BASE_IMAGE="${BASE_IMAGE}" --build-arg EXAMPLE_DIR="${EXAMPLE_DIR}" --build-arg MODEL_NAME="${MODEL_NAME}" --build-arg HUGGINGFACE_TOKEN -t "${DOCKER_TAG}" .
+DOCKER_BUILDKIT=1 docker buildx build --platform=linux/amd64 --file ${EXAMPLE_DIR}/Dockerfile --build-arg BASE_IMAGE="${BASE_IMAGE}" --build-arg EXAMPLE_DIR="${EXAMPLE_DIR}" --build-arg HUGGINGFACE_TOKEN -t "${DOCKER_TAG}" .
 
 echo "Run the following command to start the chat bot"
 echo ""
-echo docker run --rm -it --platform linux/amd64 -p 127.0.0.1:8080:8080 -p 127.0.0.1:8081:8081 -p 127.0.0.1:8082:8082 -p 127.0.0.1:8084:8084 -p 127.0.0.1:8085:8085 -v $(pwd)/model_store_1:/home/model-server/model-store $DOCKER_TAG
+echo docker run --rm -it --platform linux/amd64 -p 127.0.0.1:8080:8080 -p 127.0.0.1:8081:8081 -p 127.0.0.1:8082:8082 -p 127.0.0.1:8084:8084 -p 127.0.0.1:8085:8085 -v $(pwd)/model_store_1:/home/model-server/model-store -e MODEL_NAME="meta-llama/Llama-2-7b-chat-hf" $DOCKER_TAG
 echo ""
+echo "Note: You can replace the model identifier as needed"

‎examples/LLM/llama2/chat_app/docker/client_app.py ‎examples/LLM/llama/chat_app/docker/client_app.py

+1

@@ -6,6 +6,7 @@
 import streamlit as st
 
 MODEL_NAME = os.environ["MODEL_NAME"]
+MODEL_NAME = MODEL_NAME.replace("/", "---")
 
 # App title
 st.set_page_config(page_title="TorchServe Chatbot")
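
The `/` → `---` substitution mirrors the `sed` call in `dockerd-entrypoint.sh` that derives the model directory name, e.g.:

```
echo "meta-llama/Meta-Llama-3-8B-Instruct" | sed 's/\//---/g'
# meta-llama---Meta-Llama-3-8B-Instruct
```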
examples/LLM/llama/chat_app/docker/dockerd-entrypoint.sh (new file)

+81

@@ -0,0 +1,81 @@
+#!/bin/bash
+set -e
+
+MODEL_DIR=$(echo "$MODEL_NAME" | sed 's/\//---/g')
+
+export LLAMA_Q4_MODEL=/home/model-server/model-store/$MODEL_DIR/model/ggml-model-q4_0.gguf
+
+
+create_model_cfg_yaml() {
+  # Define the YAML content with a placeholder for the model name
+  yaml_content="# TorchServe frontend parameters\nminWorkers: 1\nmaxWorkers: 1\nresponseTimeout: 1200\n#deviceType: \"gpu\"\n#deviceIds: [0,1]\n#torchrun:\n# nproc-per-node: 1\n\nhandler:\n model_name: \"${2}\"\n manual_seed: 40"
+
+  # Create the YAML file
+  echo -e "$yaml_content" > "model-config.yaml"
+}
+
+create_model_archive() {
+  MODEL_DIR=$1
+  echo "Create model archive for ${MODEL_DIR} if it doesn't already exist"
+  if [ -d "/home/model-server/model-store/$MODEL_DIR" ]; then
+    echo "Model archive for $MODEL_DIR exists."
+  fi
+  if [ -d "/home/model-server/model-store/$MODEL_DIR/model" ]; then
+    echo "Model already downloaded"
+    mv /home/model-server/model-store/$MODEL_DIR/model /home/model-server/model-store/
+  else
+    echo "Model needs to be downloaded"
+  fi
+  torch-model-archiver --model-name "$MODEL_DIR" --version 1.0 --handler llama_cpp_handler.py --config-file "model-config.yaml" -r requirements.txt --archive-format no-archive --export-path /home/model-server/model-store -f
+  if [ -d "/home/model-server/model-store/model" ]; then
+    mv /home/model-server/model-store/model /home/model-server/model-store/$MODEL_DIR/
+  fi
+}
+
+download_model() {
+  MODEL_DIR=$1
+  MODEL_NAME=$2
+  if [ -d "/home/model-server/model-store/$MODEL_DIR/model" ]; then
+    echo "Model $MODEL_NAME already downloaded"
+  else
+    echo "Downloading model $MODEL_NAME"
+    python Download_model.py --model_path /home/model-server/model-store/$MODEL_DIR/model --model_name $MODEL_NAME
+  fi
+}
+
+quantize_model() {
+  if [ ! -f "$LLAMA_Q4_MODEL" ]; then
+    tmp_model_name=$(echo "$MODEL_DIR" | sed 's/---/--/g')
+    directory_path=/home/model-server/model-store/$MODEL_DIR/model/models--$tmp_model_name/snapshots/
+    HF_MODEL_SNAPSHOT=$(find $directory_path -type d -mindepth 1)
+    cd build
+
+    echo "Convert the model to ggml FP16 format"
+    if [[ $MODEL_NAME == *"Meta-Llama-3"* ]]; then
+      python convert.py $HF_MODEL_SNAPSHOT --vocab-type bpe,hfft --outfile ggml-model-f16.gguf
+    else
+      python convert.py $HF_MODEL_SNAPSHOT --outfile ggml-model-f16.gguf
+    fi
+
+    echo "Quantize the model to 4-bits (using q4_0 method)"
+    ./quantize ggml-model-f16.gguf $LLAMA_Q4_MODEL q4_0
+
+    cd ..
+    echo "Saved quantized model weights to $LLAMA_Q4_MODEL"
+  fi
+}
+
+if [[ "$1" = "serve" ]]; then
+  shift 1
+  create_model_cfg_yaml $MODEL_DIR $MODEL_NAME
+  create_model_archive $MODEL_DIR
+  download_model $MODEL_DIR $MODEL_NAME
+  quantize_model
+  streamlit run torchserve_server_app.py --server.port 8084 &
+  streamlit run client_app.py --server.port 8085
+else
+  eval "$@"
+fi
+
+# prevent docker exit
+tail -f /dev/null
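
On a first run with `MODEL_NAME=meta-llama/Meta-Llama-3-8B-Instruct`, the entrypoint leaves roughly the following layout in the mounted model store (a sketch derived from the paths above; exact contents may differ):

```
ls /home/model-server/model-store/meta-llama---Meta-Llama-3-8B-Instruct
# MAR-INF/  llama_cpp_handler.py  model-config.yaml  model/  ...
ls /home/model-server/model-store/meta-llama---Meta-Llama-3-8B-Instruct/model
# models--meta-llama--Meta-Llama-3-8B-Instruct/  ggml-model-q4_0.gguf   <- $LLAMA_Q4_MODEL
```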

‎examples/LLM/llama2/chat_app/docker/llama_cpp_handler.py ‎examples/LLM/llama/chat_app/docker/llama_cpp_handler.py

+1 −1

@@ -23,7 +23,7 @@ def initialize(self, ctx):
         ctx (context): It is a JSON Object containing information
         pertaining to the model artifacts parameters.
         """
-        model_path = os.environ["LLAMA2_Q4_MODEL"]
+        model_path = os.environ["LLAMA_Q4_MODEL"]
         model_name = ctx.model_yaml_config["handler"]["model_name"]
         seed = int(ctx.model_yaml_config["handler"]["manual_seed"])
         torch.manual_seed(seed)
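
For the standalone (non-Docker) Mac flow described in the chat app Readme, the renamed variable has to be exported before starting TorchServe; a minimal sketch with example paths:

```
export LLAMA_Q4_MODEL=$PWD/ggml-model-q4_0.gguf
torchserve --start --ncs --model-store model_store --models llamacpp.tar.gz
```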

‎examples/LLM/llama2/chat_app/docker/torchserve_server_app.py ‎examples/LLM/llama/chat_app/docker/torchserve_server_app.py

+1

@@ -7,6 +7,7 @@
 import streamlit as st
 
 MODEL_NAME = os.environ["MODEL_NAME"]
+MODEL_NAME = MODEL_NAME.replace("/", "---")
 MODEL = MODEL_NAME.split("---")[1]
 
 # App title

‎examples/LLM/llama2/chat_app/package_llama.sh ‎examples/LLM/llama/chat_app/package_llama.sh

+14 −11

@@ -2,12 +2,12 @@
 # Check if the argument is empty or unset
 if [ -z "$1" ]; then
     echo "Missing Mandatory argument: Path to llama weights"
-    echo "Usage: ./package_llama.sh ./model/models--meta-llama--Llama-2-7b-chat-hf/snapshots/08751db2aca9bf2f7f80d2e516117a53d7450235"
+    echo "Usage: ./package_llama.sh ./models--meta-llama--Meta-Llama-3-8B-Instruct/snapshots/e5e23bbe8e749ef0efcf16cad411a7d23bd23298"
     exit 1
 fi
 
 MODEL_GENERATION="true"
-LLAMA2_WEIGHTS="$1"
+LLAMA_WEIGHTS="$1"
 
 if [ -n "$2" ]; then
     MODEL_GENERATION="$2"
@@ -20,18 +20,22 @@ if [ "$MODEL_GENERATION" = "true" ]; then
     rm -rf build
     git clone https://github.com/ggerganov/llama.cpp.git build
     cd build
-    make
+    make
     python -m pip install -r requirements.txt
-
-    echo "Convert the 7B model to ggml FP16 format"
-    python convert.py $LLAMA2_WEIGHTS --outfile ggml-model-f16.gguf
-
+
+    echo "Convert the model to ggml FP16 format"
+    if [[ $LLAMA_WEIGHTS == *"Meta-Llama-3"* ]]; then
+        python convert.py $LLAMA_WEIGHTS --vocab-type bpe,hfft --outfile ggml-model-f16.gguf
+    else
+        python convert.py $LLAMA_WEIGHTS --outfile ggml-model-f16.gguf
+    fi
+
     echo "Quantize the model to 4-bits (using q4_0 method)"
     ./quantize ggml-model-f16.gguf ../ggml-model-q4_0.gguf q4_0
-
+
     cd ..
-    export LLAMA2_Q4_MODEL=$PWD/ggml-model-q4_0.gguf
-    echo "Saved quantized model weights to $LLAMA2_Q4_MODEL"
+    export LLAMA_Q4_MODEL=$PWD/ggml-model-q4_0.gguf
+    echo "Saved quantized model weights to $LLAMA_Q4_MODEL"
 fi
 
 echo "Creating torchserve model archive"
@@ -43,4 +47,3 @@ if [ "$MODEL_GENERATION" = "true" ]; then
     echo "Cleaning up build of llama-cpp"
     rm -rf build
 fi
-
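
Usage of the packaging script is unchanged apart from the naming; for example (the snapshot hash is a placeholder, and the optional second argument maps to `MODEL_GENERATION`):

```
# First run: convert + quantize the snapshot, then build the TorchServe archive
source package_llama.sh ./models--meta-llama--Meta-Llama-3-8B-Instruct/snapshots/<snapshot-hash>

# Subsequent runs: skip weight generation and only repackage the handler and model-config.yaml
source package_llama.sh ./models--meta-llama--Meta-Llama-3-8B-Instruct/snapshots/<snapshot-hash> false
```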
requirements.txt (new file)

+1

@@ -0,0 +1 @@
+streamlit>=1.26.0
File renamed without changes.

‎examples/LLM/llama2/chat_app/docker/dockerd-entrypoint.sh

-81
This file was deleted.

‎examples/LLM/llama2/chat_app/requirements.txt

-1
This file was deleted.
