
Commit 7652c1e
committed: updated torch.compile example
1 parent 3627ee6

File tree: 3 files changed, +60 -10 lines changed

examples/pt2/README.md (+5, -10)
````diff
@@ -1,16 +1,17 @@
 ## PyTorch 2.x integration
 
-PyTorch 2.0 brings more compiler options to PyTorch, for you that should mean better perf either in the form of lower latency or lower memory consumption. Integrating PyTorch 2.0 is fairly trivial but for now the support will be experimental given that most public benchmarks have focused on training instead of inference.
+PyTorch 2.0 brings more compiler options to PyTorch, for you that should mean better perf either in the form of lower latency or lower memory consumption.
 
 We strongly recommend you leverage newer hardware so for GPUs that would be an Ampere architecture. You'll get even more benefits from using server GPU deployments like A10G and A100 vs consumer cards. But you should expect to see some speedups for any Volta or Ampere architecture.
 
 ## Get started
 
 Install torchserve and ensure that you're using at least `torch>=2.0.0`
 
+To use the latest nightlies, you can run the following commands
 ```sh
-python ts_scripts/install_dependencies.py --cuda=cu118
-pip install torchserve torch-model-archiver
+python ts_scripts/install_dependencies.py --cuda=cu121 --nightly_torch
+pip install torchserve-nightly torch-model-archiver-nightly
 ```
 
 ## torch.compile
@@ -27,13 +28,7 @@ You can also pass a dictionary with compile options if you need more control ove
 pt2 : {backend: inductor, mode: reduce-overhead}
 ```
 
-As an example let's expand our getting started guide with the only difference being passing in the extra `model_config.yaml` file
-
-```
-mkdir model_store
-torch-model-archiver --model-name densenet161 --version 1.0 --model-file ./serve/examples/image_classifier/densenet_161/model.py --export-path model_store --extra-files ./serve/examples/image_classifier/index_to_name.json --handler image_classifier --config-file model_config.yaml
-torchserve --start --ncs --model-store model_store --models densenet161.mar
-```
+An example of using `torch.compile` can be found [here](./torch_compile/README.md)
 
 The exact same approach works with any other model, what's going on is the below
````
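The `pt2` entry in `model_config.yaml` is a plain mapping of `torch.compile` options (`backend`, `mode`, etc. are real `torch.compile` keyword arguments). As an illustration only, with a hypothetical helper that is not part of TorchServe, the translation from config to a `torch.compile` call can be sketched like this:

```python
# Hypothetical sketch: how a pt2 config entry could map onto torch.compile
# keyword arguments. This is NOT TorchServe's internal code.

def compile_kwargs(pt2_config):
    """Keep only the keys that are actual torch.compile keyword arguments."""
    supported = ("backend", "mode", "dynamic", "fullgraph")
    return {k: v for k, v in pt2_config.items() if k in supported}

# The example config from above...
cfg = {"backend": "inductor", "mode": "reduce-overhead"}
print(compile_kwargs(cfg))  # {'backend': 'inductor', 'mode': 'reduce-overhead'}

# ...would then be applied roughly as:
#   model = torch.compile(model, **compile_kwargs(cfg))
```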

examples/pt2/torch_compile/README.md (+54)
````diff
@@ -0,0 +1,54 @@
+# TorchServe inference with torch.compile of densenet161 model
+
+This example shows how to take an eager-mode `densenet161` model, configure TorchServe to use `torch.compile`, and run inference using `torch.compile`
+
+### Pre-requisites
+
+- `PyTorch >= 2.0`
+
+Change directory to the examples directory
+Ex: `cd examples/pt2/torch_compile`
+
+### torch.compile config
+
+`torch.compile` supports a variety of configs and the performance you get can vary based on the config. You can find the various options [here](https://pytorch.org/docs/stable/generated/torch.compile.html)
+
+In this example, we use the following config
+
+```yaml
+pt2 : {backend: inductor, mode: reduce-overhead}
+```
+
+### Create model archive
+
+```
+wget https://download.pytorch.org/models/densenet161-8d451a50.pth
+mkdir model_store
+torch-model-archiver --model-name densenet161 --version 1.0 --model-file ../../image_classifier/densenet_161/model.py --serialized-file densenet161-8d451a50.pth --export-path model_store --extra-files ../../image_classifier/index_to_name.json --handler image_classifier --config-file model_config.yaml -f
+```
+
+#### Start TorchServe
+
+```
+torchserve --start --ncs --model-store model_store --models densenet161.mar
+```
+
+#### Run Inference
+
+```
+curl http://127.0.0.1:8080/predictions/densenet161 -T ../../image_classifier/kitten.jpg
+```
+
+produces the output
+
+```
+{
+  "tabby": 0.4664836823940277,
+  "tiger_cat": 0.4645617604255676,
+  "Egyptian_cat": 0.06619937717914581,
+  "lynx": 0.0012969186063855886,
+  "plastic_bag": 0.00022856894065625966
+}
+```
````
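The `curl` call in the new README can equally be scripted. A minimal stdlib-only Python client sketch, assuming TorchServe is already serving `densenet161` locally as shown (the `top_class` helper is ours, not part of TorchServe):

```python
import json
import urllib.request

def predict(image_path, url="http://127.0.0.1:8080/predictions/densenet161"):
    """POST an image to the TorchServe predictions endpoint, return parsed JSON."""
    with open(image_path, "rb") as f:
        req = urllib.request.Request(url, data=f.read(), method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def top_class(scores):
    """Return the label with the highest score from a predictions response."""
    return max(scores, key=scores.get)

# With the sample response shown above:
sample = {"tabby": 0.4664836823940277, "tiger_cat": 0.4645617604255676,
          "Egyptian_cat": 0.06619937717914581}
print(top_class(sample))  # tabby
```

To run it end to end, start TorchServe as above and call `predict("../../image_classifier/kitten.jpg")`.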
examples/pt2/torch_compile/model_config.yaml (+1)

````diff
@@ -0,0 +1 @@
+pt2 : {backend: inductor, mode: reduce-overhead}
````
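`torch.compile` also documents other modes, such as `default` and `max-autotune`. Assuming the `pt2` entry forwards its options verbatim to `torch.compile` (a hypothetical variant, not part of this commit), a config trading longer compile time for more aggressive tuning could look like:

```yaml
# Hypothetical alternative: max-autotune is a documented torch.compile mode
pt2 : {backend: inductor, mode: max-autotune}
```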
