Commit 3469970

Merge branch 'master' into feature/remove_torchtext_dependency
2 parents 5d81a57 + c74a29e commit 3469970

160 files changed: +4432 −1254 lines changed


.github/ISSUE_TEMPLATE/bug.yml (+5 −5)

@@ -40,20 +40,20 @@ body:
         Did you install torchserve from source? Are you using Docker?
       placeholder: |
         Install torchserve from source: No
-        Are you using Docker: Yes I ran ./build_image.sh
+        Are you using Docker: Yes I ran ./build_image.sh
     validations:
       required: true
-
+
   - type: textarea
     attributes:
-      label: Model Packaing
+      label: Model Packaging
       description: |
         Please describe how you packaged your model
       placeholder: |
         Link to builtin handler or example you used or link to a repo or gist with your custom handler or step by step instructions with torch-model-archiver
     validations:
       required: true
-
+
   - type: textarea
     attributes:
       label: config.properties
@@ -86,7 +86,7 @@ body:
        torchserve --start
        ```
     validations:
-      required: true
+      required: true
 
   - type: textarea
     attributes:

.github/workflows/ci_cpu.yml (+8 −5)

@@ -21,18 +21,19 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        os: [ubuntu-20.04, macOS-latest]
+        os: [ubuntu-20.04, macos-latest]
     steps:
       - name: Setup Python for M1
-        if: matrix.os == 'macos-14'
+        if: matrix.os == 'macos-latest'
         uses: actions/setup-python@v5
         with:
           python-version: '3.10'
+          architecture: arm64
       - name: Setup Python for all other OS
-        if: matrix.os != 'macos-14'
+        if: matrix.os != 'macos-latest'
        uses: actions/setup-python@v5
         with:
-          python-version: 3.9
+          python-version: '3.9'
           architecture: x64
       - name: Setup Java 17
         uses: actions/setup-java@v3
@@ -47,7 +48,9 @@ jobs:
         run: |
           python ts_scripts/install_dependencies.py --environment=dev
       - name: Torchserve Sanity
-        uses: nick-fields/retry@v2
+        env:
+          TS_MAC_ARM64_CPU_ONLY: ${{ matrix.os == 'macos-latest' && 'True' || 'False' }}
+        uses: nick-fields/retry@v3
         with:
           timeout_minutes: 60
           max_attempts: 3
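The `TS_MAC_ARM64_CPU_ONLY` value above uses GitHub Actions' `&& / ||` ternary idiom, so the variable reaches the job as the string `'True'` or `'False'`, never a boolean. As a hedged sketch (this diff does not show how TorchServe's scripts actually parse the flag), a consuming Python script would typically normalize it like this:

```python
import os

def mac_arm64_cpu_only() -> bool:
    # `${{ matrix.os == 'macos-latest' && 'True' || 'False' }}` evaluates to
    # the *string* 'True' or 'False'; environment variables are always text,
    # so the consumer must compare against the literal.
    return os.environ.get("TS_MAC_ARM64_CPU_ONLY", "False") == "True"

# Simulate the macOS ARM64 runner.
os.environ["TS_MAC_ARM64_CPU_ONLY"] = "True"
print(mac_arm64_cpu_only())  # True
```

The `'False'` fallback keeps the flag safe on runners where the workflow never sets it.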

.github/workflows/ci_gpu.yml (+1 −1)

@@ -45,7 +45,7 @@ jobs:
         run: |
           python ts_scripts/install_dependencies.py --environment=dev --cuda=cu121
       - name: Torchserve Sanity
-        uses: nick-fields/retry@v2
+        uses: nick-fields/retry@v3
         with:
           timeout_minutes: 60
           retry_on: error

.github/workflows/regression_tests_cpu.yml (+8 −5)

@@ -15,23 +15,24 @@ concurrency:
 
 jobs:
   regression-cpu:
-    # creates workflows for OS: ubuntu, macOS, macOS M1
+    # creates workflows for OS: ubuntu, macOS M1
     runs-on: ${{ matrix.os }}
     strategy:
       fail-fast: false
       matrix:
-        os: [ubuntu-20.04, macOS-latest]
+        os: [ubuntu-20.04, macos-latest]
     steps:
       - name: Setup Python for M1
-        if: matrix.os == 'macos-14'
+        if: matrix.os == 'macos-latest'
         uses: actions/setup-python@v5
         with:
           python-version: '3.10'
+          architecture: arm64
       - name: Setup Python for all other OS
-        if: matrix.os != 'macos-14'
+        if: matrix.os != 'macos-latest'
         uses: actions/setup-python@v5
         with:
-          python-version: 3.9
+          python-version: '3.9'
           architecture: x64
       - name: Setup Java 17
         uses: actions/setup-java@v3
@@ -46,5 +47,7 @@ jobs:
         run: |
           python ts_scripts/install_dependencies.py --environment=dev
       - name: Torchserve Regression Tests
+        env:
+          TS_MAC_ARM64_CPU_ONLY: ${{ matrix.os == 'macos-latest' && 'True' || 'False' }}
         run: |
           python test/regression_tests.py

.github/workflows/regression_tests_cpu_binaries.yml (+14 −19)

@@ -16,7 +16,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        os: [ubuntu-20.04, macOS-latest]
+        os: [ubuntu-20.04, macos-latest]
         python-version: ["3.8", "3.9", "3.10"]
         binaries: ["pypi", "conda"]
         exclude:
@@ -31,38 +31,33 @@ jobs:
         with:
           submodules: recursive
       - name: Setup conda with Python ${{ matrix.python-version }}
-        if: matrix.os == 'macos-14'
         uses: conda-incubator/setup-miniconda@v3
         with:
           auto-update-conda: true
           channels: anaconda, conda-forge
           python-version: ${{ matrix.python-version }}
-      - name: Setup conda with Python ${{ matrix.python-version }}
-        if: matrix.os != 'macos-14'
-        uses: s-weigand/setup-conda@v1
-        with:
-          update-conda: true
-          python-version: ${{ matrix.python-version }}
-          conda-channels: anaconda, conda-forge
       - name: Setup Java 17
         uses: actions/setup-java@v3
         with:
           distribution: 'zulu'
           java-version: '17'
       - name: Checkout TorchServe
         uses: actions/checkout@v3
-      - name: Run install dependencies and regression test
-        if: matrix.os == 'macos-14'
-        shell: bash -el {0}
-        run: |
-          conda info
-          python ts_scripts/install_dependencies.py --environment=dev
-          python test/regression_tests.py --binaries --${{ matrix.binaries }} --nightly
       - name: Install dependencies
-        if: matrix.os != 'macos-14'
+        shell: bash -el {0}
         run: |
+          echo "=====CHECK ENV AND PYTHON VERSION===="
+          conda info --envs
+          python --version
+          echo "=====RUN INSTALL DEPENDENCIES===="
           python ts_scripts/install_dependencies.py --environment=dev
-      - name: Validate Torchserve CPU Regression
-        if: matrix.os != 'macos-14'
+      - name: Torchserve Regression Tests
+        shell: bash -el {0}
+        env:
+          TS_MAC_ARM64_CPU_ONLY: ${{ matrix.os == 'macos-latest' && 'True' || 'False' }}
         run: |
+          echo "=====CHECK ENV AND PYTHON VERSION===="
+          conda info --envs
+          python --version
+          echo "=====RUN REGRESSION TESTS===="
           python test/regression_tests.py --binaries --${{ matrix.binaries }} --nightly

.github/workflows/regression_tests_docker.yml (+3)

@@ -1,6 +1,9 @@
 name: Run Regression Tests on Docker
 
 on:
+  push:
+    tags:
+      - docker
   workflow_dispatch:
   # run every day at 5:15am
   schedule:

.github/workflows/regression_tests_gpu_binaries.yml (+4 −9)

@@ -39,12 +39,7 @@ jobs:
         with:
           python-version: ${{ matrix.python-version }}
           architecture: x64
-      - name: Setup Conda
-        uses: s-weigand/setup-conda@v1
-        with:
-          update-conda: true
-          python-version: ${{ matrix.python-version }}
-          conda-channels: anaconda, conda-forge
+      - run: python --version
       - run: conda --version
       - name: Setup Java 17
         uses: actions/setup-java@v3
@@ -53,17 +48,17 @@ jobs:
           java-version: '17'
       - name: Install dependencies
         shell: bash -el {0}
-        run: |
+        run: |
           echo "=====CHECK ENV AND PYTHON VERSION===="
           /home/ubuntu/actions-runner/_work/serve/serve/3/condabin/conda info --envs
           python --version
           echo "=====RUN INSTALL DEPENDENCIES===="
           python ts_scripts/install_dependencies.py --environment=dev --cuda=cu121
       - name: Torchserve Regression Tests
-        shell: bash -el {0}
+        shell: bash -el {0}
         run: |
           echo "=====CHECK ENV AND PYTHON VERSION===="
           /home/ubuntu/actions-runner/_work/serve/serve/3/condabin/conda info --envs
           python --version
           echo "=====RUN REGRESSION TESTS===="
-          python test/regression_tests.py --binaries --${{ matrix.binaries }} --nightly
+          python test/regression_tests.py --binaries --${{ matrix.binaries }} --nightly

CONTRIBUTING.md (+16 −2)

@@ -40,7 +40,21 @@ Your contributions will fall into two categories:
 
 Once you finish implementing a feature or bug-fix, please send a Pull Request to https://github.com/pytorch/serve.
 
-For more non-technical guidance about how to contribute to PyTorch, see the Contributing Guide.
+New features should always be covered by at least one integration test.
+For guidance please have a look at our [current suite of pytest tests](https://github.com/pytorch/serve/tree/master/test/pytest) and orient yourself on a test that covers a similar use case as your new feature.
+A simplified version of an example test can be found in the [mnist template test](https://github.com/pytorch/serve/blob/master/test/pytest/test_mnist_template.py) which shows how to create a mar file on the fly and register it with TorchServe from within a test.
+You can run most tests by simply executing:
+```bash
+pytest test/pytest/test_mnist_template.py
+```
+To have a look at the TorchServe and/or test output add `-s` like this:
+```bash
+pytest -s test/pytest/test_mnist_template.py
+```
+To run only a subset or a single test from a file use `-k` like this:
+```bash
+pytest -k test/pytest/test_mnist_template.py
+```
 
 ### Install TorchServe for development
 
@@ -50,7 +64,7 @@ Ensure that you have `python3` installed, and the user has access to the site-pa
 
 Run the following script from the top of the source directory.
 
-NOTE: This script force reinstalls `torchserve`, `torch-model-archiver` and `torch-workflow-archiver` if existing installations are found
+NOTE: This script force re-installs `torchserve`, `torch-model-archiver` and `torch-workflow-archiver` if existing installations are found
 
 #### For Debian Based Systems/ MacOS
 

README.md (+3 −3)

@@ -79,7 +79,7 @@ Refer to [torchserve docker](docker/README.md) for details.
 * Microsoft [DeepSpeed](examples/large_models/deepspeed), [DeepSpeed-Mii](examples/large_models/deepspeed_mii)
 * Hugging Face [Accelerate](examples/large_models/Huggingface_accelerate), [Diffusers](examples/diffusers)
 * Running large models on AWS [Sagemaker](https://docs.aws.amazon.com/sagemaker/latest/dg/large-model-inference-tutorials-torchserve.html) and [Inferentia2](https://pytorch.org/blog/high-performance-llama/)
-* Running [Llama 2 Chatbot locally on Mac](examples/LLM/llama2)
+* Running [Meta Llama Chatbot locally on Mac](examples/LLM/llama)
 * Monitoring using Grafana and [Datadog](https://www.datadoghq.com/blog/ai-integrations/#model-serving-and-deployment-vertex-ai-amazon-sagemaker-torchserve)
 
@@ -90,8 +90,8 @@ Refer to [torchserve docker](docker/README.md) for details.
 
 ## 🏆 Highlighted Examples
-* [Serving Llama 2 with TorchServe](examples/LLM/llama2/README.md)
-* [Chatbot with Llama 2 on Mac 🦙💬](examples/LLM/llama2/chat_app)
+* [Serving Meta Llama with TorchServe](examples/LLM/llama/README.md)
+* [Chatbot with Meta Llama on Mac 🦙💬](examples/LLM/llama/chat_app)
 * [🤗 HuggingFace Transformers](examples/Huggingface_Transformers) with a [Better Transformer Integration/ Flash Attention & Xformer Memory Efficient ](examples/Huggingface_Transformers#Speed-up-inference-with-Better-Transformer)
 * [Stable Diffusion](examples/diffusers)
 * [Model parallel inference](examples/Huggingface_Transformers#model-parallelism)

SECURITY.md (+13 −14)

@@ -4,12 +4,14 @@
 
 | Version | Supported          |
 |---------| ------------------ |
-| 0.10.0  | :white_check_mark: |
+| 0.11.0  | :white_check_mark: |
 
 
 ## How we do security
 
-TorchServe as much as possible relies on automated tools to do security scanning, in particular we support
+
+As much as possible, TorchServe relies on automated tools to do security scanning. In particular, we support:
+
 1. Dependency Analysis: Using Dependabot
 2. Docker Scanning: Using Snyk
 3. Code Analysis: Using CodeQL
@@ -23,22 +25,22 @@ TorchServe as much as possible relies on automated tools to do security scanning
 These ports are accessible to `localhost` by default. The addresses can be configured by following the guides for
 [HTTP](https://github.com/pytorch/serve/blob/master/docs/configuration.md#configure-torchserve-listening-address-and-port) and
 [gRPC](https://github.com/pytorch/serve/blob/master/docs/configuration.md#configure-torchserve-grpc-listening-addresses-and-ports).
-TorchServe does not prevent users from configuring the address to be any value, including the wildcard address `0.0.0.0`.
+TorchServe does not prevent users from configuring the address to be of any value, including the wildcard address `0.0.0.0`.
 Please be aware of the security risks of configuring the address to be `0.0.0.0`, this will give all addresses(including publicly accessible addresses, if any)
-on the host, access to the TorchServer endpoints listening on the ports shown above.
-2. TorchServe's Docker image is configured to expose the ports `8080`, `8081`, `8082`, `7070`, `7071` to the host by [default](https://github.com/pytorch/serve/blob/master/docker/Dockerfile). When starting the container,
-make sure to map the ports exposed by the container to `localhost` ports or a specific IP address as shown in this [security guideline](https://github.com/pytorch/serve/blob/master/docker/README.md#security-guideline).
+on the host, access to the TorchServe endpoints listening on the ports shown above.
+2. By [default](https://github.com/pytorch/serve/blob/master/docker/Dockerfile), TorchServe's Docker image is configured to expose the ports `8080`, `8081`, `8082`, `7070`, `7071` to the host. When starting the container,
+map the ports exposed by the container to `localhost` ports or a specific IP address, as shown in this [security guideline](https://github.com/pytorch/serve/blob/master/docker/README.md#security-guideline).
 
 3. Be sure to validate the authenticity of the `.mar` file being used with TorchServe.
    1. A `.mar` file being downloaded from the internet from an untrustworthy source may have malicious code, compromising the integrity of your application.
-   2. TorchServe executes arbitrary python code packaged in the `mar` file. Make sure that you've either audited that the code you're using is safe and/or is from a source that you trust.
-   3. Torchserve supports custom [plugins](https://github.com/pytorch/serve/tree/master/plugins) and [handlers](https://github.com/pytorch/serve/blob/master/docs/custom_service.md).
+   2. TorchServe executes the arbitrary python code packaged in the `mar` file. Make sure that you've either audited that the code you're using is safe and/or is from a source that you trust.
+   3. TorchServe supports custom [plugins](https://github.com/pytorch/serve/tree/master/plugins) and [handlers](https://github.com/pytorch/serve/blob/master/docs/custom_service.md).
 These can be utilized to extend TorchServe functionality to perform runtime security scanning using tools such as:
    - Clamd: https://pypi.org/project/clamd/
    - VirusTotal: https://virustotal.github.io/vt-py/
   - Fickling: https://github.com/trailofbits/fickling
-   4. Running Torchserve inside a container environment and loading an untrusted `.mar` file does not guarantee isolation from a security perspective.
-4. By default TorchServe allows you to register models from all URLs. Make sure to set `allowed_urls` parameter in config.properties to restrict this. You can find more details in the [configuration guide](https://pytorch.org/serve/configuration.html#other-properties).
+   4. Running TorchServe inside a container environment and loading an untrusted `.mar` file does not guarantee isolation from a security perspective.
+4. By default, TorchServe allows you to register models from all URLs. Make sure to set `allowed_urls` parameter in config.properties to restrict this. You can find more details in the [configuration guide](https://pytorch.org/serve/configuration.html#other-properties).
    - `use_env_allowed_urls=true` is required in config.properties to read `allowed_urls` from environment variable.
 5. Enable SSL:
 
@@ -57,9 +59,6 @@ TorchServe as much as possible relies on automated tools to do security scanning
 7. If you intend to run multiple models in parallel with shared memory, it is your responsibility to ensure the models do not interact or access each other's data. The primary areas of concern are tenant isolation, resource allocation, model sharing and hardware attacks.
 8. TorchServe supports token authorization: check [documentation](https://github.com/pytorch/serve/blob/master/docs/token_authorization_api.md) for more information.
 
-
-
-
 ## Reporting a Vulnerability
 
-If you find a serious vulnerability please report it to https://www.facebook.com/whitehat and [email protected]
+If you find a vulnerability please report it to https://www.facebook.com/whitehat and [email protected]

binaries/conda/build_packages.py (+7 −1)

@@ -22,7 +22,13 @@
 PACKAGES = ["torchserve", "torch-model-archiver", "torch-workflow-archiver"]
 
 # conda convert supported platforms https://docs.conda.io/projects/conda-build/en/stable/resources/commands/conda-convert.html
-PLATFORMS = ["linux-64", "osx-64", "win-64", "osx-arm64"]  # Add a new platform here
+PLATFORMS = [
+    "linux-64",
+    "osx-64",
+    "win-64",
+    "osx-arm64",
+    "linux-aarch64",
+]  # Add a new platform here
 
 if os.name == "nt":
     # Assumes miniconda is installed in windows
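The hunk above adds `linux-aarch64` as a new `conda convert` target. The actual invocation in `build_packages.py` is outside this hunk, so the following is only a hypothetical sketch of how such a platform list typically drives the conversion commands; the package filename is illustrative:

```python
PLATFORMS = [
    "linux-64",
    "osx-64",
    "win-64",
    "osx-arm64",
    "linux-aarch64",  # newly added target
]

def convert_commands(package_path, output_dir="output"):
    # One `conda convert` command per target platform; appending an entry to
    # PLATFORMS automatically yields one more conversion.
    return [
        ["conda", "convert", "--platform", platform, package_path, "-o", output_dir]
        for platform in PLATFORMS
    ]

cmds = convert_commands("linux-64/torchserve-example.tar.bz2")
print(len(cmds))  # 5
```

Keeping the list as data means adding a platform is a one-line change, which is exactly what this commit does.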

docs/configuration.md (+10 −2)

@@ -93,7 +93,7 @@ inference_address=https://127.0.0.1:8443
 inference_address=https://172.16.1.10:8080
 ```
 
-### Configure TorchServe gRPC listening addresses and ports
+### Configure TorchServe gRPC listening addresses, ports and max connection age
 The inference gRPC API is listening on port 7070, and the management gRPC API is listening on port 7071 on localhost by default.
 
 To configure different addresses use following properties
@@ -106,7 +106,15 @@ To configure different ports use following properties
 * `grpc_inference_port`: Inference gRPC API binding port. Default: 7070
 * `grpc_management_port`: management gRPC API binding port. Default: 7071
 
-Here are a couple of examples:
+To configure [max connection age](https://grpc.github.io/grpc-java/javadoc/io/grpc/netty/NettyServerBuilder.html#maxConnectionAge(long,java.util.concurrent.TimeUnit)) (milliseconds)
+
+* `grpc_inference_max_connection_age_ms`: Inference gRPC max connection age. Default: Infinite
+* `grpc_management_max_connection_age_ms`: Management gRPC max connection age. Default: Infinite
+
+To configure [max connection age grace](https://grpc.github.io/grpc-java/javadoc/io/grpc/netty/NettyServerBuilder.html#maxConnectionAgeGrace(long,java.util.concurrent.TimeUnit)) (milliseconds)
+
+* `grpc_inference_max_connection_age_grace_ms`: Inference gRPC max connection age grace. Default: Infinite
+* `grpc_management_max_connection_age_grace_ms`: Management gRPC max connection age grace. Default: Infinite
 
 ### Enable SSL
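The four new properties are set in `config.properties` like any other TorchServe option. A sketch with illustrative values only (the property names come from the diff above; the numbers are examples, not defaults — the documented default is infinite):

```properties
# Illustrative: recycle gRPC connections after 5 minutes,
# allowing 30 extra seconds of grace for in-flight calls to finish.
grpc_inference_max_connection_age_ms=300000
grpc_inference_max_connection_age_grace_ms=30000
grpc_management_max_connection_age_ms=300000
grpc_management_max_connection_age_grace_ms=30000
```

Bounding connection age is useful behind L4 load balancers, where long-lived gRPC connections otherwise pin all traffic to one backend.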
