
Commit da2a2e0

agunapal and mreso authored
update PyTorch 2.x examples to use PyTorch >=2.3 (#3111)
* Updated SAM Fast and aot_compile example
* Updated Diffusion Fast example
* Updated GPT Fast example
* Updated GPT Fast example

Co-authored-by: Matthias Reso <[email protected]>
1 parent 3d23fc3 commit da2a2e0

10 files changed: +26 -72 lines changed

examples/large_models/diffusion_fast/README.md (+1 -2)

@@ -21,8 +21,7 @@ The example has been tested on A10, A100 as well as H100.
 Install dependencies and upgrade torch to nightly build (currently required)
 ```
 git clone https://github.com/huggingface/diffusion-fast.git
-pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121 --ignore-installed -y
-pip install accelerate transformers peft
+pip install accelerate transformers diffusers peft
 pip install --no-cache-dir git+https://github.com/pytorch-labs/ao@54bcd5a10d0abbe7b0c045052029257099f83fd9
 pip install pandas matplotlib seaborn
 ```
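
The change above drops the nightly install and assumes a stable PyTorch >= 2.3 is already present. A quick post-install sanity check could look like the sketch below; the version floor comes from this commit, while the check itself is illustrative and not part of the example.

```python
# Verify the installed PyTorch satisfies the >= 2.3 floor these examples
# now assume (illustrative sketch, not part of the commit).
from packaging import version

import torch

installed = version.parse(torch.__version__.split("+")[0])  # strip e.g. "+cu121"
assert installed >= version.parse("2.3.0"), (
    f"PyTorch >= 2.3.0 required, found {torch.__version__}"
)
print(f"PyTorch {torch.__version__} OK")
```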

examples/large_models/gpt_fast/README.md (+6 -8)

@@ -18,20 +18,18 @@ The examples has been tested on A10, A100 as well as H100.
 
 #### Pre-requisites
 
+- PyTorch 2.3
+- CUDA >= 11.8
+
 `cd` to the example folder `examples/large_models/gpt_fast`
 
-Install dependencies and upgrade torch to nightly build (currently required)
+Install dependencies
 ```
 git clone https://github.com/pytorch-labs/gpt-fast/
+cd gpt-fast
 git checkout f44ef4eb55b54ec4c452b669eee409421adabd60
 pip install sentencepiece huggingface_hub
-pip uninstall torchtext torchdata torch torchvision torchaudio -y
-pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121 --ignore-installed
-```
-
-You can also install PyTorch nightlies using the below command
-```
-python ./ts_scripts/install_dependencies.py --cuda=cu121 --nightly_torch
+cd ..
 ```
 
 ### Step 1: Download and convert the weights
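
The dependency list keeps `huggingface_hub`, which the download in Step 1 relies on. A minimal sketch of that fetch is below; the repo id matches the `converted_ckpt_dir` in `model_config.yaml`, but the local layout is an assumption and gpt-fast's own download/convert scripts remain authoritative.

```python
# Minimal weight-download sketch for Step 1 (illustrative; gpt-fast ships
# its own scripts for downloading and converting the checkpoint).
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="meta-llama/Llama-2-7b-chat-hf",  # matches model_config.yaml below
    local_dir="checkpoints/meta-llama/Llama-2-7b-chat-hf",  # assumed layout
    # gated repo: requires an accepted license and an HF access token
)
```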

examples/large_models/gpt_fast/model_config.yaml (+1 -1)

@@ -5,7 +5,7 @@ maxBatchDelay: 200
 responseTimeout: 300
 deviceType: "gpu"
 handler:
-    converted_ckpt_dir: "checkpoints/meta-llama/Llama-2-7b-hf/model.pth"
+    converted_ckpt_dir: "checkpoints/meta-llama/Llama-2-7b-chat-hf/model.pth"
     max_new_tokens: 50
     compile: true
     fx_graph_cache: True
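
These handler keys reach the example at runtime through `ctx.model_yaml_config`, the same lookup pattern the segment-anything handler further down this commit uses. A sketch of that consumption, with the surrounding class purely illustrative:

```python
# Sketch of how a TorchServe handler reads the YAML keys above via
# ctx.model_yaml_config; the class is a stand-in, the lookup pattern mirrors
# the handlers in this repo (see the SAM handler below).
import torch._inductor.config as inductor_config


class GptFastHandlerSketch:
    def initialize(self, ctx):
        cfg = ctx.model_yaml_config["handler"]
        self.ckpt = cfg["converted_ckpt_dir"]             # model.pth path above
        self.max_new_tokens = int(cfg["max_new_tokens"])  # 50
        if cfg.get("fx_graph_cache", False):
            # Cache compiled FX graphs so torch.compile warmup is paid once
            inductor_config.fx_graph_cache = True
```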

examples/large_models/segment_anything_fast/README.md (+8 -14)

@@ -15,7 +15,8 @@ Details on how this is achieved can be found in this [blog](https://pytorch.org/
 
 #### Pre-requisites
 
-Needs python 3.10
+- Needs python 3.10
+- PyTorch >= 2.3.0
 
 `cd` to the example folder `examples/large_models/segment_anything_fast`
 
@@ -24,8 +25,6 @@ Install `Segment Anything Fast` by running
 chmod +x install_segment_anything_fast.sh
 source install_segment_anything_fast.sh
 ```
-Segment Anything Fast needs the nightly version of PyTorch. Hence the script is uninstalling PyTorch, its domain libraries and installing the nightly version of PyTorch.
-
 
 ### Step 1: Download the weights
 
@@ -47,26 +46,21 @@ Example:
 - For `A100` : `process_batch_size=16`
 
 
-### Step 2: Generate mar or tgz file
-
-```
-torch-model-archiver --model-name sam-fast --version 1.0 --handler custom_handler.py --config-file model-config.yaml --archive-format tgz
-```
-
-### Step 3: Add the tgz file to model store
+### Step 2: Generate model archive
 
 ```
 mkdir model_store
-mv sam-fast.tar.gz model_store
+torch-model-archiver --model-name sam-fast --version 1.0 --handler custom_handler.py --config-file model-config.yaml --archive-format no-archive --export-path model_store -f
+mv sam_vit_h_4b8939.pth model_store/sam-fast/
 ```
 
-### Step 4: Start torchserve
+### Step 3: Start torchserve
 
 ```
-torchserve --start --ncs --model-store model_store --models sam-fast.tar.gz
+torchserve --start --ncs --model-store model_store --models sam-fast
 ```
 
-### Step 5: Run inference
+### Step 4: Run inference
 
 ```
 python inference.py
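
The renumbered Step 4 still goes through the example's `inference.py`. For orientation, a hand-rolled request against the model registered above might look like the sketch below, assuming TorchServe's default inference address and a handler that accepts raw image bytes; `inference.py` remains the authoritative client and also decodes the returned masks.

```python
# Minimal inference request sketch (assumes TorchServe's default inference
# port 8080 and raw image bytes as input; the example's inference.py is the
# authoritative client).
import requests

with open("example_image.jpg", "rb") as f:  # hypothetical input image
    resp = requests.post("http://localhost:8080/predictions/sam-fast", data=f)
resp.raise_for_status()
print(resp.status_code, len(resp.content), "bytes in response")
```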

examples/large_models/segment_anything_fast/custom_handler.py (+5 -1)

@@ -1,6 +1,7 @@
 import base64
 import io
 import logging
+import os
 import pickle
 
 import cv2
@@ -23,6 +24,7 @@ def __init__(self):
 
     def initialize(self, ctx):
         properties = ctx.system_properties
+        model_dir = properties.get("model_dir")
         self.device = "cpu"
         if torch.cuda.is_available() and properties.get("gpu_id") is not None:
             self.map_location = "cuda"
@@ -32,7 +34,9 @@ def initialize(self, ctx):
             torch.cuda.set_device(self.device)
 
         model_type = ctx.model_yaml_config["handler"]["model_type"]
-        sam_checkpoint = ctx.model_yaml_config["handler"]["sam_checkpoint"]
+        sam_checkpoint = os.path.join(
+            model_dir, ctx.model_yaml_config["handler"]["sam_checkpoint"]
+        )
         process_batch_size = ctx.model_yaml_config["handler"]["process_batch_size"]
 
         self.model = sam_model_fast_registry[model_type](checkpoint=sam_checkpoint)
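
The `os.path.join` change pairs with the README's new `no-archive` flow: the checkpoint is copied into the model directory, and `model-config.yaml` now stores only the file name. Illustratively, with a hypothetical directory (TorchServe assigns the real `model_dir` at runtime):

```python
# Why the join matters under --archive-format no-archive: the checkpoint
# lives inside the model directory TorchServe hands the handler.
# The directory value below is hypothetical.
import os

model_dir = "/tmp/models/sam-fast"  # set by TorchServe at runtime
sam_checkpoint = os.path.join(model_dir, "sam_vit_h_4b8939.pth")
print(sam_checkpoint)  # /tmp/models/sam-fast/sam_vit_h_4b8939.pth
```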

examples/large_models/segment_anything_fast/install_segment_anything_fast.sh (-12)

@@ -1,17 +1,5 @@
 #!/bin/bash
 
-# Uninstall torchtext, torchdata, torch, torchvision, and torchaudio
-pip uninstall torchtext torchdata torch torchvision torchaudio -y
-
-# Install nightly PyTorch and torchvision from the specified index URL
-pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121 --ignore-installed
-
-# Optional: Display the installed PyTorch and torchvision versions
-python -c "import torch; print('PyTorch version:', torch.__version__)"
-python -c "import torchvision; print('torchvision version:', torchvision.__version__)"
-
-echo "PyTorch and torchvision updated successfully!"
-
 # Install the segment-anything-fast package from GitHub
 pip install git+https://github.com/pytorch-labs/segment-anything-fast.git

examples/large_models/segment_anything_fast/model-config.yaml (+1 -1)

@@ -2,5 +2,5 @@ responseTimeout: 300
 handler:
     profile: true
     model_type: "vit_h"
-    sam_checkpoint: "/home/ubuntu/serve/examples/large_models/segment_anything_fast/sam_vit_h_4b8939.pth"
+    sam_checkpoint: "sam_vit_h_4b8939.pth"
     process_batch_size: 8

examples/pt2/README.md (+2 -2)

@@ -1,6 +1,6 @@
 ## PyTorch 2.x integration
 
-PyTorch 2.0 brings more compiler options to PyTorch, for you that should mean better perf either in the form of lower latency or lower memory consumption.
+PyTorch 2.x brings more compiler options to PyTorch, for you that should mean better perf either in the form of lower latency or lower memory consumption.
 
 We strongly recommend you leverage newer hardware so for GPUs that would be an Ampere architecture. You'll get even more benefits from using server GPU deployments like A10G and A100 vs consumer cards. But you should expect to see some speedups for any Volta or Ampere architecture.
 
@@ -16,7 +16,7 @@ pip install torchserve-nightly torch-model-archiver-nightly
 
 ## torch.compile
 
-PyTorch 2.0 supports several compiler backends and you pick which one you want by passing in an optional file `model_config.yaml` during your model packaging
+PyTorch 2.x supports several compiler backends and you pick which one you want by passing in an optional file `model_config.yaml` during your model packaging
 
 ```yaml
 pt2: "inductor"

examples/pt2/torch_export_aot_compile/README.md (+2 -14)

@@ -7,24 +7,12 @@ To understand when to use `torch._export.aot_compile`, please refer to this [sec
 
 ### Pre-requisites
 
-- `PyTorch >= 2.3.0` (or PyTorch nightlies)
-- `CUDA 12.1`
+- `PyTorch >= 2.3.0`
+- `CUDA >= 11.8`
 
 Change directory to the examples directory
 Ex: `cd examples/pt2/torch_export_aot_compile`
 
-Install PyTorch 2.3 nightlies by running
-```
-chmod +x install_pytorch_nightlies.sh
-source install_pytorch_nightlies.sh
-```
-
-You can also achieve this by installing TorchServe dependencies with the `nightly_torch` flag
-```
-python ts_scripts/install_dependencies.py --cuda=cu121 --nightly_torch
-```
-
-
 ### Create a Torch exported model with AOTInductor
 
 The model is saved with `.so` extension
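
The export step this README introduces produces that `.so` through `torch._export.aot_compile`. A minimal sketch under the PyTorch 2.3-era API; the model and output path are stand-ins, and the example ships its own export script.

```python
# Sketch of exporting a model to a .so with AOTInductor via
# torch._export.aot_compile (PyTorch 2.3-era private API; model and output
# path are illustrative).
import os

import torch

model = torch.nn.Sequential(torch.nn.Linear(8, 4), torch.nn.ReLU()).eval()
example_inputs = (torch.randn(1, 8),)
with torch.no_grad():
    so_path = torch._export.aot_compile(
        model,
        example_inputs,
        options={"aot_inductor.output_path": os.path.join(os.getcwd(), "model_pt2.so")},
    )
print("saved:", so_path)  # path to the generated shared library
```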

examples/pt2/torch_export_aot_compile/install_pytorch_nightlies.sh (-17)

This file was deleted.
