
Commit d29059f

torch compile config standardization update (#3166)
* torch.compile config update
* torch.compile config update
* yaml test files
* yaml test files
* Fixed regression failure
* Fixed regression failure
* Fixed regression failure
* Workaround for regression failure
* Workaround for regression failure
* Workaround for regression failure
* skipping torchtext test
* Update test_example_torch_compile.py
* Update test_torch_compile.py
* Rename toy_model.py to model.py
* Update test_torch_compile.py
* Update test_torch_compile.py
* Addressed review comments
* Addressed review comments
1 parent 3d17a94 commit d29059f

File tree

14 files changed: +198 -44 lines changed


examples/image_classifier/resnet_18/README.md (+5 -1)

@@ -23,7 +23,11 @@ Ex: `cd examples/image_classifier/resnet_18`
 In this example , we use the following config
 
 ```
-echo "pt2 : {backend: inductor, mode: reduce-overhead}" > model-config.yaml
+echo "pt2:
+  compile:
+    enable: True
+    backend: inductor
+    mode: reduce-overhead" > model-config.yaml
 ```
 
 ##### Sample commands to create a Resnet18 torch.compile model archive, register it on TorchServe and run image prediction

examples/pt2/README.md (+10 -4)

@@ -16,16 +16,22 @@ pip install torchserve-nightly torch-model-archiver-nightly
 
 ## torch.compile
 
-PyTorch 2.x supports several compiler backends and you pick which one you want by passing in an optional file `model_config.yaml` during your model packaging
+PyTorch 2.x supports several compiler backends and you pick which one you want by passing in an optional file `model_config.yaml` during your model packaging. The default backend with the below minimum config is `inductor`
 
 ```yaml
-pt2: "inductor"
+pt2:
+  compile:
+    enable: True
 ```
 
-You can also pass a dictionary with compile options if you need more control over torch.compile:
+You can also pass various compile options if you need more control over torch.compile:
 
 ```yaml
-pt2 : {backend: inductor, mode: reduce-overhead}
+pt2:
+  compile:
+    enable: True
+    backend: inductor
+    mode: reduce-overhead
 ```
 
 An example of using `torch.compile` can be found [here](./torch_compile/README.md)
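For orientation, here is a minimal sketch of how a handler could translate the `pt2.compile` section above into a `torch.compile` call. The `maybe_compile` helper and the YAML-loading step are illustrative assumptions made for this page, not TorchServe's actual initialization code.

```python
import torch
import yaml  # PyYAML, assumed available


def maybe_compile(model: torch.nn.Module, config_path: str) -> torch.nn.Module:
    """Hypothetical helper mirroring the pt2.compile schema shown above."""
    with open(config_path) as f:
        config = yaml.safe_load(f) or {}

    compile_cfg = (config.get("pt2") or {}).get("compile") or {}
    if not compile_cfg.get("enable", False):
        # enable: False (or a missing compile section) keeps the eager model
        return model

    # Remaining keys (backend, mode, ...) are forwarded to torch.compile;
    # with no keys given, torch.compile falls back to the inductor backend.
    kwargs = {k: v for k, v in compile_cfg.items() if k != "enable"}
    return torch.compile(model, **kwargs)
```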

examples/pt2/torch_compile/README.md (+10 -4)

@@ -19,7 +19,9 @@ Ex: `cd examples/pt2/torch_compile`
 In this example , we use the following config
 
 ```
-echo "pt2 : {backend: inductor, mode: reduce-overhead}" > model-config.yaml
+echo "pt2:
+  compile:
+    enable: True" > model-config.yaml
 ```
 
 ### Create model archive
@@ -76,9 +78,13 @@ After a few iterations of warmup, we see the following
 #### Measure inference time with `torch.compile`
 
 ```
-echo "pt2: {backend: inductor, mode: reduce-overhead}" > model-config.yaml && \
-echo "handler:" >> model-config.yaml && \
-echo " profile: true" >> model-config.yaml
+echo "pt2:
+  compile:
+    enable: True
+    backend: inductor
+    mode: reduce-overhead" > model-config.yaml && \
+echo "handler:
+  profile: true" >> model-config.yaml
 ```
 
 Once the `yaml` file is updated, create the model-archive, start TorchServe and run inference using the steps shown above.
@@ -1 +1,5 @@
-pt2 : {backend: inductor, mode: reduce-overhead}
+pt2:
+  compile:
+    enable: True
+    backend: inductor
+    mode: reduce-overhead
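As a rough illustration of what the `handler: profile: true` option in the torch_compile README above is measuring, the sketch below times the three handler phases separately. The wrapper is a hypothetical stand-in for TorchServe's built-in profiling, shown only to make the measured phases concrete.

```python
import time


def timed_handle(handler, data, ctx):
    # Hypothetical wrapper: time preprocess / inference / postprocess,
    # roughly the per-phase numbers that profile: true surfaces in the logs.
    timings = {}

    start = time.perf_counter()
    batch = handler.preprocess(data)
    timings["preprocess_ms"] = (time.perf_counter() - start) * 1000

    start = time.perf_counter()
    output = handler.inference(batch)
    timings["inference_ms"] = (time.perf_counter() - start) * 1000

    start = time.perf_counter()
    result = handler.postprocess(output)
    timings["postprocess_ms"] = (time.perf_counter() - start) * 1000

    return result, timings
```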

examples/pt2/torch_compile_openvino/README.md (+17 -4)

@@ -36,15 +36,21 @@ In this example, we use the following config:
 ```bash
 echo "minWorkers: 1
 maxWorkers: 2
-pt2: {backend: openvino}" > model-config.yaml
+pt2:
+  compile:
+    enable: True
+    backend: openvino" > model-config.yaml
 ```
 
 If you want to measure the handler `preprocess`, `inference`, `postprocess` times, use the following config:
 
 ```bash
 echo "minWorkers: 1
 maxWorkers: 2
-pt2: {backend: openvino}
+pt2:
+  compile:
+    enable: True
+    backend: openvino
 handler:
   profile: true" > model-config.yaml
 ```
@@ -132,7 +138,11 @@ Update the model-config.yaml file to specify the Inductor backend:
 ```bash
 echo "minWorkers: 1
 maxWorkers: 2
-pt2: {backend: inductor, mode: reduce-overhead}
+pt2:
+  compile:
+    enable: True
+    backend: inductor
+    mode: reduce-overhead
 handler:
   profile: true" > model-config.yaml
 ```
@@ -153,7 +163,10 @@ Update the model-config.yaml file to specify the OpenVINO backend:
 ```bash
 echo "minWorkers: 1
 maxWorkers: 2
-pt2: {backend: openvino}
+pt2:
+  compile:
+    enable: True
+    backend: openvino
 handler:
   profile: true" > model-config.yaml
 ```
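For reference, a minimal standalone sketch of what `backend: openvino` amounts to at the `torch.compile` level. It assumes the `openvino` package is installed and that importing `openvino.torch` registers the backend, per OpenVINO's torch.compile integration; the toy model is purely illustrative.

```python
import torch
import openvino.torch  # noqa: F401  (assumed to register the "openvino" backend)


class ToyModel(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x) * 0.5 + 2.0


model = ToyModel().eval()

# Counterpart of pt2: {compile: {enable: True, backend: openvino}}
compiled = torch.compile(model, backend="openvino")

with torch.no_grad():
    print(compiled(torch.randn(1, 3)))
```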

examples/pt2/torch_inductor_caching/README.md (+10 -2)

@@ -41,7 +41,11 @@ Ex: `cd examples/pt2/torch_inductor_caching`
 In this example , we use the following config
 
 ```yaml
-pt2 : {backend: inductor, mode: max-autotune}
+pt2:
+  compile:
+    enable: True
+    backend: inductor
+    mode: max-autotune
 ```
 
 ### Create model archive
@@ -126,7 +130,11 @@ Ex: `cd examples/pt2/torch_inductor_caching`
 In this example , we use the following config
 
 ```yaml
-pt2 : {backend: inductor, mode: max-autotune}
+pt2:
+  compile:
+    enable: True
+    backend: inductor
+    mode: max-autotune
 ```
 
 ### Create model archive
@@ -1,7 +1,11 @@
 minWorkers: 4
 maxWorkers: 4
 responseTimeout: 600
-pt2 : {backend: inductor, mode: max-autotune}
+pt2:
+  compile:
+    enable: True
+    backend: inductor
+    mode: max-autotune
 handler:
   torch_inductor_caching:
     torch_inductor_cache_dir: "/home/ubuntu/serve/examples/pt2/torch_inductor_caching/cache"
@@ -1,7 +1,11 @@
 minWorkers: 4
 maxWorkers: 4
 responseTimeout: 600
-pt2 : {backend: inductor, mode: max-autotune}
+pt2:
+  compile:
+    enable: True
+    backend: inductor
+    mode: max-autotune
 handler:
   torch_inductor_caching:
     torch_inductor_fx_graph_cache: true
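The two handler options in these configs correspond, at least conceptually, to PyTorch's own inductor caching knobs. How TorchServe maps the YAML keys onto them is not visible in this diff, so the sketch below should be read as an assumed, simplified equivalent rather than the handler's real wiring.

```python
import os

import torch
import torch._inductor.config as inductor_config

# Counterpart of torch_inductor_cache_dir: a persistent on-disk cache location
# (assumed mapping onto the standard TORCHINDUCTOR_CACHE_DIR variable)
os.environ["TORCHINDUCTOR_CACHE_DIR"] = "/tmp/torchinductor_cache"

# Counterpart of torch_inductor_fx_graph_cache: reuse compiled FX graphs
# across worker processes instead of recompiling from scratch
inductor_config.fx_graph_cache = True

model = torch.nn.Linear(8, 8)
compiled = torch.compile(model, backend="inductor", mode="max-autotune")
print(compiled(torch.randn(2, 8)).shape)
```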
@@ -0,0 +1,3 @@
+pt2:
+  compile:
+    enable: True

@@ -0,0 +1,5 @@
+pt2:
+  compile:
+    enable: False
+    backend: inductor
+    mode: reduce-overhead

@@ -0,0 +1,5 @@
+pt2:
+  compile:
+    enable: True
+    backend: inductor
+    mode: reduce-overhead

test/pytest/test_example_torch_compile.py (+22 -19)

@@ -1,4 +1,5 @@
 import os
+import sys
 from pathlib import Path
 
 import pytest
@@ -31,34 +32,36 @@
 EXPECTED_RESULTS = ["tabby", "tiger_cat", "Egyptian_cat", "lynx", "plastic_bag"]
 
 
-@pytest.fixture
-def custom_working_directory(tmp_path):
-    # Set the custom working directory
-    custom_dir = tmp_path / "model_dir"
-    custom_dir.mkdir()
-    os.chdir(custom_dir)
-    yield custom_dir
-    # Clean up and return to the original working directory
-    os.chdir(tmp_path)
+@pytest.fixture(scope="function")
+def chdir_example(monkeypatch):
+    # Change directory to example directory
+    monkeypatch.chdir(EXAMPLE_ROOT_DIR)
+    monkeypatch.syspath_prepend(EXAMPLE_ROOT_DIR)
+    yield
 
+    # Teardown
+    monkeypatch.undo()
 
-@pytest.mark.skipif(PT2_AVAILABLE == False, reason="torch version is < 2.0")
-@pytest.mark.skip(reason="Skipping as its causing other testcases to fail")
-def test_torch_compile_inference(monkeypatch, custom_working_directory):
-    monkeypatch.syspath_prepend(EXAMPLE_ROOT_DIR)
-    # Get the path to the custom working directory
-    model_dir = custom_working_directory
+    # Delete imported model
+    model = MODEL_FILE.split(".")[0]
+    if model in sys.modules:
+        del sys.modules[model]
 
-    try_and_handle(
-        f"wget https://download.pytorch.org/models/{MODEL_PTH_FILE} -P {model_dir}"
-    )
+
+@pytest.mark.skipif(PT2_AVAILABLE == False, reason="torch version is < 2.0")
+def test_torch_compile_inference(chdir_example):
+    # Download weights
+    if not os.path.isfile(EXAMPLE_ROOT_DIR.joinpath(MODEL_PTH_FILE)):
+        try_and_handle(
+            f"wget https://download.pytorch.org/models/{MODEL_PTH_FILE} -P {EXAMPLE_ROOT_DIR}"
+        )
 
     # Handler for Image classification
     handler = ImageClassifier()
 
     # Context definition
     ctx = MockContext(
-        model_pt_file=model_dir.joinpath(MODEL_PTH_FILE),
+        model_pt_file=MODEL_PTH_FILE,
         model_dir=EXAMPLE_ROOT_DIR.as_posix(),
         model_file=MODEL_FILE,
         model_yaml_config_file=MODEL_YAML_CFG_FILE,

test/pytest/test_torch_compile.py (+74 -2)

@@ -3,12 +3,16 @@
 import os
 import platform
 import subprocess
+import sys
 import time
 from pathlib import Path
 
 import pytest
 import torch
 from pkg_resources import packaging
+from test_data.torch_compile.compile_handler import CompileHandler
+
+from ts.torch_handler.unit_tests.test_utils.mock_context import MockContext
 
 PT_2_AVAILABLE = (
     True
@@ -20,15 +24,42 @@
 CURR_FILE_PATH = Path(__file__).parent
 TEST_DATA_DIR = os.path.join(CURR_FILE_PATH, "test_data", "torch_compile")
 
-MODEL_FILE = os.path.join(TEST_DATA_DIR, "model.py")
+MODEL = "model.py"
+MODEL_FILE = os.path.join(TEST_DATA_DIR, MODEL)
 HANDLER_FILE = os.path.join(TEST_DATA_DIR, "compile_handler.py")
 YAML_CONFIG_STR = os.path.join(TEST_DATA_DIR, "pt2.yaml")  # backend as string
 YAML_CONFIG_DICT = os.path.join(TEST_DATA_DIR, "pt2_dict.yaml")  # arbitrary kwargs dict
+YAML_CONFIG_ENABLE = os.path.join(
+    TEST_DATA_DIR, "pt2_enable_true.yaml"
+)  # arbitrary kwargs dict
+YAML_CONFIG_ENABLE_FALSE = os.path.join(
+    TEST_DATA_DIR, "pt2_enable_false.yaml"
+)  # arbitrary kwargs dict
+YAML_CONFIG_ENABLE_DEFAULT = os.path.join(
+    TEST_DATA_DIR, "pt2_enable_default.yaml"
+)  # arbitrary kwargs dict
 
 
 SERIALIZED_FILE = os.path.join(TEST_DATA_DIR, "model.pt")
 MODEL_STORE_DIR = os.path.join(TEST_DATA_DIR, "model_store")
 MODEL_NAME = "half_plus_two"
+EXPECTED_RESULT = 3.5
+
+
+@pytest.fixture(scope="function")
+def chdir_example(monkeypatch):
+    # Change directory to example directory
+    monkeypatch.chdir(TEST_DATA_DIR)
+    monkeypatch.syspath_prepend(TEST_DATA_DIR)
+    yield
+
+    # Teardown
+    monkeypatch.undo()
+
+    # Delete imported model
+    model = MODEL.split(".")[0]
+    if model in sys.modules:
+        del sys.modules[model]
 
 
 @pytest.mark.skipif(
@@ -119,7 +150,6 @@ def _response_to_tuples(response_str):
         os.environ.get("TS_RUN_IN_DOCKER", False),
         reason="Test to be run outside docker",
     )
-    @pytest.mark.skip(reason="Test failing on regression runner")
     def test_serve_inference(self):
         request_data = {"instances": [[1.0], [2.0], [3.0]]}
         request_json = json.dumps(request_data)
@@ -146,3 +176,45 @@ def test_serve_inference(self):
             "Compiled model with backend inductor, mode reduce-overhead"
             in model_log
         )
+
+    @pytest.mark.parametrize(
+        ("compile"), ("disabled", "enabled", "enabled_reduce_overhead")
+    )
+    def test_compile_inference_enable_options(self, chdir_example, compile):
+        # Reset dynamo
+        torch._dynamo.reset()
+
+        # Handler
+        handler = CompileHandler()
+
+        if compile == "enabled":
+            model_yaml_config_file = YAML_CONFIG_ENABLE_DEFAULT
+        elif compile == "disabled":
+            model_yaml_config_file = YAML_CONFIG_ENABLE_FALSE
+        elif compile == "enabled_reduce_overhead":
+            model_yaml_config_file = YAML_CONFIG_ENABLE
+
+        # Context definition
+        ctx = MockContext(
+            model_pt_file=SERIALIZED_FILE,
+            model_dir=TEST_DATA_DIR,
+            model_file=MODEL,
+            model_yaml_config_file=model_yaml_config_file,
+        )
+
+        torch.manual_seed(42 * 42)
+        handler.initialize(ctx)
+        handler.context = ctx
+
+        # Check that model is compiled using dynamo
+        if compile == "enabled" or compile == "enabled_reduce_overhead":
+            assert isinstance(handler.model, torch._dynamo.OptimizedModule)
+        else:
+            assert not isinstance(handler.model, torch._dynamo.OptimizedModule)
+
+        # Data for testing
+        data = {"body": {"instances": [[1.0], [2.0], [3.0]]}}
+
+        result = handler.handle([data], ctx)
+
+        assert result[0] == EXPECTED_RESULT
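One non-obvious detail in both test fixtures above is the teardown step that deletes the imported `model` module from `sys.modules`. A small sketch of the underlying issue, using a hypothetical helper name, is shown below: without the deletion, a later test would silently reuse the earlier test's cached `model` module even though the working directory has changed.

```python
import importlib
import sys


def fresh_import(module_name: str):
    # Hypothetical helper: Python caches imports in sys.modules, so a module
    # named "model" imported by an earlier test would shadow the "model.py"
    # of the current example. Dropping it forces a clean re-import.
    if module_name in sys.modules:
        del sys.modules[module_name]
    return importlib.import_module(module_name)
```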
