Adding Multi-Image generation usecase app #3356

Merged
merged 45 commits on Nov 23, 2024
Commits (45)
f81b953
Added OV SDXL registration to chat_bot app
ravi9 Jun 14, 2024
13a8b75
sdxl image generation
Jun 19, 2024
c11657b
pass model params
Jun 21, 2024
3772fa6
fixes
Jun 24, 2024
6cb2fc0
fixes
Jun 24, 2024
06f9672
llm-sd pipeline
Jun 25, 2024
eb6e81f
store images
Jun 25, 2024
aa1147e
need to fix sd_xl checkbox
Jun 28, 2024
a9cae48
fix for num_of_img==1
Jul 3, 2024
49a16bf
fix for 1 img, total time
likholat Jul 3, 2024
0d2be45
perf fixes
likholat Jul 10, 2024
fe5a755
fixes
likholat Jul 12, 2024
a1a0ae9
llm with torch.compile
Jul 23, 2024
8bb9cd2
fixed tocken auth issue, ui fixes
Jul 24, 2024
2ec5568
gpt fast version, bad quality of output prompts
Aug 16, 2024
721d926
rm extra files, updated readme
Aug 16, 2024
fcdf5f3
added llama params, sd default res 768, better prompts
Aug 29, 2024
a1ec7a1
fix, updated default workers num
Sep 4, 2024
9d1da57
button for prompts generation
Sep 5, 2024
aff7b39
fix
likholat Sep 5, 2024
244c3de
fix
likholat Sep 5, 2024
8e69626
Changed SDXL to LCM SDXL
suryasidd Sep 9, 2024
36af2a7
updated lcm example
ravi9 Sep 12, 2024
62f3360
updated lcm example
ravi9 Sep 12, 2024
9d68037
updated lcm example
ravi9 Sep 12, 2024
3a4f39e
Merge branch 'pytorch:master' into surya/lcm_sdxl
ravi9 Oct 22, 2024
e2bb145
Merge branch 'pytorch:master' into surya/lcm_sdxl
ravi9 Nov 3, 2024
b2a1e2c
add llm_sd_app
ravi9 Nov 4, 2024
bd11e71
Updated llm_diffusion_serving_app
ravi9 Nov 4, 2024
17e531f
Updated llm_diffusion_serving_app
ravi9 Nov 4, 2024
c346a1f
Update llm_diffusion_serving_app
ravi9 Nov 4, 2024
0d965e6
Update llm_diffusion_serving_app
ravi9 Nov 4, 2024
db0fd66
Update examples/usecases/llm_diffusion_serving_app/Readme.md
ravi9 Nov 5, 2024
300adeb
Update llm_diffusion_serving_app
ravi9 Nov 7, 2024
604dfde
update llm_diffusion_serving_app
ravi9 Nov 12, 2024
bfa3fda
update llm_diffusion_serving_app
ravi9 Nov 14, 2024
5dd9b00
update llm_diffusion_serving_app
ravi9 Nov 14, 2024
4713829
Update llm_diffusion_serving_app
ravi9 Nov 14, 2024
3c1114a
Update llm_diffusion_serving_app
ravi9 Nov 14, 2024
a0bd985
Minor Updates, Added sd_benchmark
ravi9 Nov 20, 2024
57842e3
Add docs for llm_diffusion_serving_app
ravi9 Nov 20, 2024
ba4a2fe
Apply suggestions from code review
ravi9 Nov 21, 2024
28e1b53
Update llm_diffusion_serving_app, fix linter issues
ravi9 Nov 23, 2024
b59670b
Update img, add assests
ravi9 Nov 23, 2024
12d86ab
update readme
ravi9 Nov 23, 2024
126 changes: 126 additions & 0 deletions examples/usecases/llm_diffusion_serving_app/Readme.md
@@ -0,0 +1,126 @@

# Multi-Image Generation App with Streamlit, Llama, Stable Diffusion, OpenVINO, TorchServe

This Streamlit app generates multiple images from a given text prompt. It leverages [TorchServe](https://pytorch.org/serve/) for efficient model serving and management, uses [Meta-Llama-3.2](https://huggingface.co/meta-llama) for prompt generation, and generates images with **Stable Diffusion** using [latent-consistency/lcm-sdxl](https://huggingface.co/latent-consistency/lcm-sdxl) and [torch.compile with the OpenVINO backend](https://docs.openvino.ai/2024/openvino-workflow/torch-compile.html).

![Multi-Image Generation App Workflow](./docker/workflow-1.png)
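
For a sense of what the image-generation path looks like, here is a minimal, self-contained sketch (not the app's actual TorchServe handler) of LCM-SDXL in `diffusers` with its UNet compiled through the OpenVINO backend of `torch.compile`. It assumes `torch`, `diffusers`, `transformers`, and `openvino` are installed; the prompt and output filename are illustrative.

```python
import torch
import openvino.torch  # registers the "openvino" backend for torch.compile
from diffusers import DiffusionPipeline, LCMScheduler, UNet2DConditionModel

# Load the LCM-distilled UNet and plug it into the SDXL base pipeline.
unet = UNet2DConditionModel.from_pretrained("latent-consistency/lcm-sdxl")
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", unet=unet
)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

# Compile the UNet (the hot loop of diffusion) for CPU inference via OpenVINO.
pipe.unet = torch.compile(pipe.unet, backend="openvino")

# LCM needs only a few denoising steps per image.
image = pipe(
    "a rustic cabin by an alpine lake at sunrise",
    num_inference_steps=4,
    guidance_scale=8.0,
).images[0]
image.save("output.png")
```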

## Quick Start Guide

**Prerequisites**:
- Docker installed on your system
- Hugging Face Token: Create a Hugging Face account and obtain a token with access to the [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct) model (a quick way to verify access is sketched below).
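
If you want to confirm the token before building, a quick sanity check with the Hugging Face CLI (assuming `huggingface_hub` is installed on the host):

```bash
huggingface-cli login --token <HUGGINGFACE_TOKEN>
huggingface-cli whoami
# Downloading a single small file verifies access to the gated repo:
huggingface-cli download meta-llama/Llama-3.2-3B-Instruct config.json
```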


To launch the app, run the following:
```bash
# 1: Set HF Token as Env variable
export HUGGINGFACE_TOKEN=<HUGGINGFACE_TOKEN>

# 2: Build Docker image of this Multi-Image Generation App
git clone https://github.com/pytorch/serve.git
cd serve
./examples/usecases/llm_diffusion_serving_app/docker/build_image.sh

# 3: Launch the streamlit app for server & client
# After the Docker build is successful, you will see a command printed to start the app. Run that command to launch the Streamlit app for both the server and client.
```

#### Sample Output:
```console
ubuntu@ip-10-0-0-137:~/serve$ ./examples/usecases/llm_diffusion_serving_app/docker/build_image.sh
EXAMPLE_DIR: .//examples/usecases/llm_diffusion_serving_app/docker
ROOT_DIR: /home/ubuntu/serve
DOCKER_BUILDKIT=1 docker buildx build --platform=linux/amd64 --file .//examples/usecases/llm_diffusion_serving_app/docker/Dockerfile --build-arg BASE_IMAGE="pytorch/torchserve:latest-cpu" --build-arg EXAMPLE_DIR=".//examples/usecases/llm_diffusion_serving_app/docker" --build-arg HUGGINGFACE_TOKEN=hf_<token> --build-arg HTTP_PROXY= --build-arg HTTPS_PROXY= --build-arg NO_PROXY= -t "pytorch/torchserve:llm_diffusion_serving_app" .
[+] Building 1.4s (18/18) FINISHED docker:default
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 2.33kB 0.0s
=> [internal] load metadata for docker.io/pytorch/torchserve:latest-cpu 0.2s
=> [server 1/13] FROM docker.io/pytorch/torchserve:latest-cpu@sha256:50e189492f630a56214dce45ec6fd8db3ad45845890c0e8c26b469c6b06ca4fe 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 3.54kB 0.0s
=> CACHED [server 2/13] RUN --mount=type=cache,id=apt-dev,target=/var/cache/apt apt-get update && apt-get install libopenmpi-dev git -y 0.0s
=> CACHED [server 3/13] WORKDIR /home/model-server/ 0.0s
=> CACHED [server 4/13] COPY .//examples/usecases/llm_diffusion_serving_app/docker/requirements.txt /home/model-server/requirements.txt 0.0s
=> CACHED [server 5/13] RUN pip install -r requirements.txt 0.0s
=> CACHED [server 6/13] COPY .//examples/usecases/llm_diffusion_serving_app/docker/sd/requirements.txt /home/model-server/sd_requirements.txt 0.0s
=> CACHED [server 7/13] RUN pip install -r sd_requirements.txt 0.0s
=> [server 8/13] COPY .//examples/usecases/llm_diffusion_serving_app/docker /home/model-server/llm_diffusion_serving_app/ 0.0s
=> [server 9/13] RUN --mount=type=secret,id=hf_token huggingface-cli login --token hf_<token> 0.5s
=> [server 10/13] WORKDIR /home/model-server/llm_diffusion_serving_app 0.0s
=> [server 11/13] COPY .//examples/usecases/llm_diffusion_serving_app/docker/dockerd-entrypoint.sh /usr/local/bin/dockerd-entrypoint.sh 0.0s
=> [server 12/13] COPY .//examples/usecases/llm_diffusion_serving_app/docker/config.properties /home/model-server/config.properties 0.0s
=> [server 13/13] RUN chmod +x /usr/local/bin/dockerd-entrypoint.sh && chown -R model-server /home/model-server 0.3s
=> exporting to image 0.1s
=> => exporting layers 0.1s
=> => writing image sha256:e900f12e6dad3ec443966766f82860427fa066aefe504a415eecf69bf4c3c043 0.0s
=> => naming to docker.io/pytorch/torchserve:llm_diffusion_serving_app 0.0s

Run the following command to start the Multi-image generation App

docker run --rm -it --platform linux/amd64 \
-p 127.0.0.1:8080:8080 \
-p 127.0.0.1:8081:8081 \
-p 127.0.0.1:8082:8082 \
-p 127.0.0.1:8084:8084 \
-p 127.0.0.1:8085:8085 \
-v /home/ubuntu/serve/model-store-local:/home/model-server/model-store \
-e MODEL_NAME_LLM=meta-llama/Llama-3.2-3B-Instruct \
-e MODEL_NAME_SD=stabilityai/stable-diffusion-xl-base-1.0 \
pytorch/torchserve:llm_diffusion_serving_app

Note: You can replace the model identifier as needed
```

## What to expect
Launching the docker run command printed by the build script starts two Streamlit apps:
1. TorchServe Server App (running at http://localhost:8084) to start/stop TorchServe, load/register models, and scale workers up or down. The same operations are also available through TorchServe's standard REST APIs, as sketched below.
2. Client App (running at http://localhost:8085) where you can enter prompts for image generation.
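
For example, once the container is up you can talk to TorchServe directly over its inference and management APIs (ports as mapped in the docker run command above; substitute a model name returned by the `/models` call):

```bash
curl http://localhost:8080/ping        # inference API liveness check
curl http://localhost:8081/models      # list registered models
# Scale workers for a registered model:
curl -X PUT "http://localhost:8081/models/<model-name>?min_worker=2"
```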

#### Sample Output:

```console
ubuntu@ip-10-0-0-137:~/serve$ docker run --rm -it --platform linux/amd64 \
-p 127.0.0.1:8080:8080 \
-p 127.0.0.1:8081:8081 \
-p 127.0.0.1:8082:8082 \
-p 127.0.0.1:8084:8084 \
-p 127.0.0.1:8085:8085 \
-v /home/ubuntu/serve/model-store-local:/home/model-server/model-store \
-e MODEL_NAME_LLM=meta-llama/Llama-3.2-3B-Instruct \
-e MODEL_NAME_SD=stabilityai/stable-diffusion-xl-base-1.0 \
pytorch/torchserve:llm_diffusion_serving_app

Preparing meta-llama/Llama-3.2-1B-Instruct
/home/model-server/llm_diffusion_serving_app/llm /home/model-server/llm_diffusion_serving_app
Model meta-llama---Llama-3.2-1B-Instruct already downloaded.
Model archive for meta-llama---Llama-3.2-1B-Instruct exists.
/home/model-server/llm_diffusion_serving_app

Preparing stabilityai/stable-diffusion-xl-base-1.0
/home/model-server/llm_diffusion_serving_app/sd /home/model-server/llm_diffusion_serving_app
Model stabilityai/stable-diffusion-xl-base-1.0 already downloaded
Model archive for stabilityai---stable-diffusion-xl-base-1.0 exists.
/home/model-server/llm_diffusion_serving_app

Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.


Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.


You can now view your Streamlit app in your browser.

Local URL: http://localhost:8085
Network URL: http://123.11.0.2:8085
External URL: http://123.123.12.34:8085


You can now view your Streamlit app in your browser.

Local URL: http://localhost:8084
Network URL: http://123.11.0.2:8084
External URL: http://123.123.12.34:8084
```
49 changes: 49 additions & 0 deletions examples/usecases/llm_diffusion_serving_app/docker/Dockerfile
@@ -0,0 +1,49 @@
# Use multi-stage build with PyTorch Serve base image
ARG BASE_IMAGE=pytorch/torchserve:latest-cpu
FROM $BASE_IMAGE AS server
# Build arguments
ARG EXAMPLE_DIR
ARG HUGGINGFACE_TOKEN
ARG HTTP_PROXY
ARG HTTPS_PROXY
ARG NO_PROXY

# Set proxy environment variables if provided
ENV http_proxy=${HTTP_PROXY}
ENV https_proxy=${HTTPS_PROXY}
ENV no_proxy=${NO_PROXY}
ENV TS_DISABLE_TOKEN_AUTHORIZATION=true

# Switch to root for installation
USER root

RUN --mount=type=cache,id=apt-dev,target=/var/cache/apt \
apt-get update && \
apt-get install libopenmpi-dev git -y

WORKDIR /home/model-server/

# Copy and install main requirements
COPY ${EXAMPLE_DIR}/requirements.txt /home/model-server/requirements.txt
# RUN pip install --no-cache-dir -r requirements.txt
RUN pip install -r requirements.txt

# Copy and install SD-specific requirements
COPY ${EXAMPLE_DIR}/sd/requirements.txt /home/model-server/sd_requirements.txt
# RUN pip install --no-cache-dir -r sd_requirements.txt
RUN pip install -r sd_requirements.txt

# Copy application code
COPY ${EXAMPLE_DIR} /home/model-server/llm_diffusion_serving_app/

# Login to Hugging Face
RUN --mount=type=secret,id=hf_token \
huggingface-cli login --token ${HUGGINGFACE_TOKEN}

WORKDIR /home/model-server/llm_diffusion_serving_app

COPY $EXAMPLE_DIR/dockerd-entrypoint.sh /usr/local/bin/dockerd-entrypoint.sh
COPY $EXAMPLE_DIR/config.properties /home/model-server/config.properties

RUN chmod +x /usr/local/bin/dockerd-entrypoint.sh \
&& chown -R model-server /home/model-server
55 changes: 55 additions & 0 deletions examples/usecases/llm_diffusion_serving_app/docker/build_image.sh
@@ -0,0 +1,55 @@
#!/bin/bash
set -e
BASE_IMAGE="pytorch/torchserve:latest-cpu"

DOCKER_TAG="pytorch/torchserve:llm_diffusion_serving_app"

# LLM_HF_ID=meta-llama/Meta-Llama-3-8B
# LLM_HF_ID=meta-llama/Llama-3.2-1B-Instruct
LLM_HF_ID=meta-llama/Llama-3.2-3B-Instruct
SD_HF_ID=stabilityai/stable-diffusion-xl-base-1.0

# Get relative path of example dir
EXAMPLE_DIR=$(dirname "$(readlink -f "$0")")
ROOT_DIR=${EXAMPLE_DIR}/../../../../
ROOT_DIR=$(realpath "$ROOT_DIR")
EXAMPLE_DIR=$(echo "$EXAMPLE_DIR" | sed "s|$ROOT_DIR|./|")

echo "EXAMPLE_DIR: $EXAMPLE_DIR"
echo "ROOT_DIR: $ROOT_DIR"

# Build docker image for the application
docker_build_cmd="DOCKER_BUILDKIT=1 \
docker buildx build \
--platform=linux/amd64 \
--file ${EXAMPLE_DIR}/Dockerfile \
--build-arg BASE_IMAGE=\"${BASE_IMAGE}\" \
--build-arg EXAMPLE_DIR=\"${EXAMPLE_DIR}\" \
--build-arg HUGGINGFACE_TOKEN=${HUGGINGFACE_TOKEN} \
--build-arg HTTP_PROXY=$http_proxy \
--build-arg HTTPS_PROXY=$https_proxy \
--build-arg NO_PROXY=$no_proxy \
-t \"${DOCKER_TAG}\" ."

echo -e "$docker_build_cmd"

eval $docker_build_cmd

mkdir -p model-store-local

echo ""
echo "Run the following command to start the Multi-image generation App"
echo ""
echo "docker run --rm -it --platform linux/amd64 \\
-p 127.0.0.1:8080:8080 \\
-p 127.0.0.1:8081:8081 \\
-p 127.0.0.1:8082:8082 \\
-p 127.0.0.1:8084:8084 \\
-p 127.0.0.1:8085:8085 \\
-v ${PWD}/model-store-local:/home/model-server/model-store \\
-e MODEL_NAME_LLM=${LLM_HF_ID} \\
-e MODEL_NAME_SD=${SD_HF_ID} \\
$DOCKER_TAG"

echo ""
echo "Note: You can replace the model identifier as needed"