
Commit 89c5389

udaij12 and agunapal authored
Adding mps support to base handler and regression test (#3048)
* adding mps support to base handler and regression test
* fixed method
* mps support
* fix format
* changes to detection
* testing x86
* adding m1 check
* adding test cases
* adding test workflow
* modifiying tests
* removing python tests
* remove workflow
* removing test config file
* adding docs
* fixing spell check
* lint fix

---------

Co-authored-by: Ankith Gunapal <[email protected]>
1 parent 8450a2e commit 89c5389

File tree

6 files changed: +349 -0 lines changed

docs/apple_silicon_support.md

+129
@@ -0,0 +1,129 @@
# Apple Silicon Support

## What is supported
* TorchServe CI jobs now include M1 hardware in order to ensure support; see the GitHub [documentation](https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#standard-github-hosted-runners-for-public-repositories) on M1 hosted runners.
    - [Regression Tests](https://github.com/pytorch/serve/blob/master/.github/workflows/regression_tests_cpu.yml)
    - [Regression binaries Test](https://github.com/pytorch/serve/blob/master/.github/workflows/regression_tests_cpu_binaries.yml)
* For [Docker](https://docs.docker.com/desktop/install/mac-install/), ensure Docker for Apple silicon is installed, then follow the [setup steps](https://github.com/pytorch/serve/tree/master/docker).

## Experimental Support

* For GPU jobs on Apple Silicon, [MPS](https://pytorch.org/docs/master/notes/mps.html) is now auto-detected and enabled. To prevent TorchServe from using MPS, users have to set `deviceType: "cpu"` in model-config.yaml.
* This is an experimental feature and NOT ALL models are guaranteed to work.
* Number of GPUs now reports GPUs on Apple Silicon

### Testing
* [Pytests](https://github.com/pytorch/serve/tree/master/test/pytest/test_device_config.py) that check for MPS on macOS M1 devices
* Models that have been tested and work: Resnet-18, Densenet161, Alexnet
* Models that have been tested and DO NOT work: MNIST

#### Example Resnet-18 Using MPS On Mac M1 Pro
```
serve % torchserve --start --model-store model_store_gen --models resnet-18=resnet-18.mar --ncs

Torchserve version: 0.10.0
Number of GPUs: 16
Number of CPUs: 10
Max heap size: 8192 M
Python executable: /Library/Frameworks/Python.framework/Versions/3.11/bin/python3.11
Config file: N/A
Inference address: http://127.0.0.1:8080
Management address: http://127.0.0.1:8081
Metrics address: http://127.0.0.1:8082
Model Store:
Initial Models: resnet-18=resnet-18.mar
Log dir:
Metrics dir:
Netty threads: 0
Netty client threads: 0
Default workers per model: 16
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Limit Maximum Image Pixels: true
Prefer direct buffer: false
Allowed Urls: [file://.*|http(s)?://.*]
Custom python dependency for model allowed: false
Enable metrics API: true
Metrics mode: LOG
Disable system metrics: false
Workflow Store:
CPP log config: N/A
Model config: N/A
2024-04-08T14:18:02,380 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Loading snapshot serializer plugin...
2024-04-08T14:18:02,391 [INFO ] main org.pytorch.serve.ModelServer - Loading initial models: resnet-18.mar
2024-04-08T14:18:02,699 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model resnet-18
2024-04-08T14:18:02,699 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model resnet-18 loaded.
2024-04-08T14:18:02,699 [DEBUG] main org.pytorch.serve.wlm.ModelManager - updateModel: resnet-18, count: 16
...
...
serve % curl http://127.0.0.1:8080/predictions/resnet-18 -T ./examples/image_classifier/kitten.jpg
...
{
  "tabby": 0.40966302156448364,
  "tiger_cat": 0.3467046618461609,
  "Egyptian_cat": 0.1300288736820221,
  "lynx": 0.02391958422958851,
  "bucket": 0.011532187461853027
}
...
```
#### Conda Example

```
(myenv) serve % pip list | grep torch
torch                     2.2.1
torchaudio                2.2.1
torchdata                 0.7.1
torchtext                 0.17.1
torchvision               0.17.1
(myenv3) serve % conda install -c pytorch-nightly torchserve torch-model-archiver torch-workflow-archiver
(myenv3) serve % pip list | grep torch
torch                     2.2.1
torch-model-archiver      0.10.0b20240312
torch-workflow-archiver   0.2.12b20240312
torchaudio                2.2.1
torchdata                 0.7.1
torchserve                0.10.0b20240312
torchtext                 0.17.1
torchvision               0.17.1
(myenv3) serve % torchserve --start --ncs --models densenet161.mar --model-store ./model_store_gen/
Torchserve version: 0.10.0
Number of GPUs: 0
Number of CPUs: 10
Max heap size: 8192 M
Config file: N/A
Inference address: http://127.0.0.1:8080
Management address: http://127.0.0.1:8081
Metrics address: http://127.0.0.1:8082
Initial Models: densenet161.mar
Netty threads: 0
Netty client threads: 0
Default workers per model: 10
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Limit Maximum Image Pixels: true
Prefer direct buffer: false
Allowed Urls: [file://.*|http(s)?://.*]
Custom python dependency for model allowed: false
Enable metrics API: true
Metrics mode: LOG
Disable system metrics: false
CPP log config: N/A
Model config: N/A
System metrics command: default
...
2024-03-12T15:58:54,702 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model densenet161 loaded.
2024-03-12T15:58:54,702 [DEBUG] main org.pytorch.serve.wlm.ModelManager - updateModel: densenet161, count: 10
Model server started.
...
(myenv3) serve % curl http://127.0.0.1:8080/predictions/densenet161 -T examples/image_classifier/kitten.jpg
{
  "tabby": 0.46661922335624695,
  "tiger_cat": 0.46449029445648193,
  "Egyptian_cat": 0.0661405548453331,
  "lynx": 0.001292439759708941,
  "plastic_bag": 0.00022909720428287983
}
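The MPS auto-detection described under "Experimental Support" above depends on the local PyTorch build exposing a working MPS backend. A minimal sketch (not TorchServe code; the `check_mps` helper is made up for illustration) of how one might verify that on an M1 machine before serving a model:

```python
# Standalone sanity check, assuming PyTorch >= 1.12 with MPS support built in.
import torch


def check_mps() -> bool:
    # is_built(): the wheel was compiled with MPS; is_available(): this macOS/GPU can use it now.
    if torch.backends.mps.is_built() and torch.backends.mps.is_available():
        x = torch.ones(2, 2, device="mps")         # allocate directly on the Apple GPU
        return bool((x + x).sum().item() == 8.0)   # run a tiny op on MPS and verify the result
    return False


print("MPS usable:", check_mps())
```

If MPS should not be used even when this check passes, the doc above notes that setting `deviceType: "cpu"` in model-config.yaml keeps that model's workers on CPU.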

frontend/server/src/main/java/org/pytorch/serve/util/ConfigManager.java

+24
@@ -5,9 +5,11 @@
 import io.netty.handler.ssl.SslContext;
 import io.netty.handler.ssl.SslContextBuilder;
 import io.netty.handler.ssl.util.SelfSignedCertificate;
+import java.io.BufferedReader;
 import java.io.File;
 import java.io.IOException;
 import java.io.InputStream;
+import java.io.InputStreamReader;
 import java.lang.reflect.Field;
 import java.lang.reflect.Type;
 import java.net.InetAddress;
@@ -835,6 +837,28 @@ private static int getAvailableGpu() {
             for (String id : ids) {
                 gpuIds.add(Integer.parseInt(id));
             }
+        } else if (System.getProperty("os.name").startsWith("Mac")) {
+            Process process = Runtime.getRuntime().exec("system_profiler SPDisplaysDataType");
+            int ret = process.waitFor();
+            if (ret != 0) {
+                return 0;
+            }
+
+            BufferedReader reader =
+                    new BufferedReader(new InputStreamReader(process.getInputStream()));
+            String line;
+            while ((line = reader.readLine()) != null) {
+                if (line.contains("Chipset Model:") && !line.contains("Apple M1")) {
+                    return 0;
+                }
+                if (line.contains("Total Number of Cores:")) {
+                    String[] parts = line.split(":");
+                    if (parts.length >= 2) {
+                        return (Integer.parseInt(parts[1].trim()));
+                    }
+                }
+            }
+            throw new AssertionError("Unexpected response.");
         } else {
             Process process =
                     Runtime.getRuntime().exec("nvidia-smi --query-gpu=index --format=csv");
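As the Java above shows, on macOS the frontend now counts GPUs by running `system_profiler SPDisplaysDataType` and reading the `Total Number of Cores:` line, returning 0 for non-Apple-Silicon chipsets. A rough Python restatement of that parsing, for illustration only; the authoritative logic is the `getAvailableGpu()` change in this diff:

```python
# Illustrative Python equivalent of the macOS branch added to getAvailableGpu().
import subprocess


def apple_gpu_core_count() -> int:
    result = subprocess.run(
        ["system_profiler", "SPDisplaysDataType"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return 0
    for line in result.stdout.splitlines():
        # A non-Apple-Silicon chipset means no MPS GPU, mirroring the Java check.
        if "Chipset Model:" in line and "Apple M1" not in line:
            return 0
        if "Total Number of Cores:" in line:
            return int(line.split(":", 1)[1].strip())
    return 0  # the Java version raises AssertionError here; 0 keeps the sketch simple


if __name__ == "__main__":
    print(apple_gpu_core_count())
```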

frontend/server/src/test/java/org/pytorch/serve/util/ConfigManagerTest.java

+14
@@ -105,4 +105,18 @@ public void testNoWorkflowState() throws ReflectiveOperationException, IOExcepti
                 workingDir + "/frontend/archive/src/test/resources/models",
                 configManager.getWorkflowStore());
     }
+
+    @Test
+    public void testNumGpuM1() throws ReflectiveOperationException, IOException {
+        System.setProperty("tsConfigFile", "src/test/resources/config_test_env.properties");
+        ConfigManager.Arguments args = new ConfigManager.Arguments();
+        args.setModels(new String[] {"noop_v0.1"});
+        args.setSnapshotDisabled(true);
+        ConfigManager.init(args);
+        ConfigManager configManager = ConfigManager.getInstance();
+        String arch = System.getProperty("os.arch");
+        if (arch.equals("aarch64")) {
+            Assert.assertTrue(configManager.getNumberOfGpu() > 0);
+        }
+    }
 }

test/pytest/test_device_config.py

+168
@@ -0,0 +1,168 @@
import os
import platform
import shutil
import tempfile
from pathlib import Path
from unittest.mock import patch

import pytest
import requests
import test_utils
from model_archiver import ModelArchiverConfig

CURR_FILE_PATH = Path(__file__).parent
REPO_ROOT_DIR = CURR_FILE_PATH.parent.parent
ROOT_DIR = os.path.join(tempfile.gettempdir(), "workspace")
REPO_ROOT = os.path.join(os.path.dirname(os.path.abspath(__file__)), "../../")
data_file_zero = os.path.join(REPO_ROOT, "test/pytest/test_data/0.png")
config_file = os.path.join(REPO_ROOT, "test/resources/config_token.properties")
mnist_scriptes_py = os.path.join(REPO_ROOT, "examples/image_classifier/mnist/mnist.py")

HANDLER_PY = """
from ts.torch_handler.base_handler import BaseHandler

class deviceHandler(BaseHandler):

    def initialize(self, context):
        super().initialize(context)
        assert self.get_device().type == "mps"
"""

MODEL_CONFIG_YAML = """
#frontend settings
# TorchServe frontend parameters
minWorkers: 1
batchSize: 4
maxWorkers: 4
"""

MODEL_CONFIG_YAML_GPU = """
#frontend settings
# TorchServe frontend parameters
minWorkers: 1
batchSize: 4
maxWorkers: 4
deviceType: "gpu"
"""

MODEL_CONFIG_YAML_CPU = """
#frontend settings
# TorchServe frontend parameters
minWorkers: 1
batchSize: 4
maxWorkers: 4
deviceType: "cpu"
"""


@pytest.fixture(scope="module")
def model_name():
    yield "mnist"


@pytest.fixture(scope="module")
def work_dir(tmp_path_factory, model_name):
    return Path(tmp_path_factory.mktemp(model_name))


@pytest.fixture(scope="module")
def model_config_name(request):
    def get_config(param):
        if param == "cpu":
            return MODEL_CONFIG_YAML_CPU
        elif param == "gpu":
            return MODEL_CONFIG_YAML_GPU
        else:
            return MODEL_CONFIG_YAML

    return get_config(request.param)


@pytest.fixture(scope="module", name="mar_file_path")
def create_mar_file(work_dir, model_archiver, model_name, model_config_name):
    mar_file_path = work_dir.joinpath(model_name + ".mar")

    model_config_yaml_file = work_dir / "model_config.yaml"
    model_config_yaml_file.write_text(model_config_name)

    model_py_file = work_dir / "model.py"

    model_py_file.write_text(mnist_scriptes_py)

    handler_py_file = work_dir / "handler.py"
    handler_py_file.write_text(HANDLER_PY)

    config = ModelArchiverConfig(
        model_name=model_name,
        version="1.0",
        serialized_file=None,
        model_file=mnist_scriptes_py,  # model_py_file.as_posix(),
        handler=handler_py_file.as_posix(),
        extra_files=None,
        export_path=work_dir,
        requirements_file=None,
        runtime="python",
        force=False,
        archive_format="default",
        config_file=model_config_yaml_file.as_posix(),
    )

    with patch("archiver.ArgParser.export_model_args_parser", return_value=config):
        model_archiver.generate_model_archive()

    assert mar_file_path.exists()

    yield mar_file_path.as_posix()

    # Clean up files

    mar_file_path.unlink(missing_ok=True)

    # Clean up files


@pytest.fixture(scope="module", name="model_name")
def register_model(mar_file_path, model_store, torchserve):
    """
    Register the model in torchserve
    """
    shutil.copy(mar_file_path, model_store)

    file_name = Path(mar_file_path).name

    model_name = Path(file_name).stem

    params = (
        ("model_name", model_name),
        ("url", file_name),
        ("initial_workers", "1"),
        ("synchronous", "true"),
        ("batch_size", "1"),
    )

    test_utils.reg_resp = test_utils.register_model_with_params(params)

    yield model_name

    test_utils.unregister_model(model_name)


@pytest.mark.skipif(platform.machine() != "arm64", reason="Skip on Mac M1")
@pytest.mark.parametrize("model_config_name", ["gpu"], indirect=True)
def test_m1_device(model_name, model_config_name):
    response = requests.get(f"http://localhost:8081/models/{model_name}")
    assert response.status_code == 200, "Describe Failed"


@pytest.mark.skipif(platform.machine() != "arm64", reason="Skip on Mac M1")
@pytest.mark.parametrize("model_config_name", ["cpu"], indirect=True)
def test_m1_device_cpu(model_name, model_config_name):
    response = requests.get(f"http://localhost:8081/models/{model_name}")
    assert response.status_code == 404, "Describe Worked"


@pytest.mark.skipif(platform.machine() != "arm64", reason="Skip on Mac M1")
@pytest.mark.parametrize("model_config_name", ["default"], indirect=True)
def test_m1_device_default(model_name, model_config_name):
    response = requests.get(f"http://localhost:8081/models/{model_name}")
    assert response.status_code == 200, "Describe Failed"

ts/torch_handler/base_handler.py

+12
@@ -144,11 +144,15 @@ def initialize(self, context):
             self.model_yaml_config = context.model_yaml_config

         properties = context.system_properties
+
         if torch.cuda.is_available() and properties.get("gpu_id") is not None:
             self.map_location = "cuda"
             self.device = torch.device(
                 self.map_location + ":" + str(properties.get("gpu_id"))
             )
+        elif torch.backends.mps.is_available() and properties.get("gpu_id") is not None:
+            self.map_location = "mps"
+            self.device = torch.device("mps")
         elif XLA_AVAILABLE:
             self.device = xm.xla_device()
         else:
@@ -524,3 +528,11 @@ def describe_handle(self):
         # pylint: disable=unnecessary-pass
         pass
         # pylint: enable=unnecessary-pass
+
+    def get_device(self):
+        """Get device
+
+        Returns:
+            string : self device
+        """
+        return self.device
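With this change, `BaseHandler.initialize` resolves the worker device in the order CUDA, then MPS, then XLA, then CPU, and the new `get_device()` accessor exposes the result to custom handlers (which is exactly what the pytest handler above asserts). A condensed, self-contained sketch of that ordering, assuming no `torch_xla` is installed; the authoritative logic is the diff above:

```python
# Condensed restatement of the device-selection order in BaseHandler.initialize.
import torch

XLA_AVAILABLE = False  # stand-in for the module-level torch_xla import flag in base_handler.py


def resolve_device(gpu_id):
    if torch.cuda.is_available() and gpu_id is not None:
        return torch.device(f"cuda:{gpu_id}")   # NVIDIA GPU assigned by the frontend
    if torch.backends.mps.is_available() and gpu_id is not None:
        return torch.device("mps")              # new: Apple Silicon GPU via MPS
    if XLA_AVAILABLE:
        import torch_xla.core.xla_model as xm   # only importable when torch_xla is installed
        return xm.xla_device()
    return torch.device("cpu")                  # default fallback


print(resolve_device(gpu_id=0))  # expected: device(type='mps') on an M1 Mac without CUDA
```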

ts_scripts/spellcheck_conf/wordlist.txt

+2
@@ -1216,3 +1216,5 @@ libomp
 rpath
 venv
 TorchInductor
+Pytests
+deviceType
