This example uses AOTInductor to compile the [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) model into a shared object (`.so`) file (see the script [aot_compile_export.py](aot_compile_export.py)). In PyTorch 2.2, the maximum `MAX_SEQ_LENGTH` supported by this script is 511.
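For orientation, here is a minimal sketch of what such an export could look like, assuming PyTorch 2.2's `torch._export.aot_compile` API. The `BertWrapper` module, input construction, and output path below are illustrative assumptions, not necessarily what [aot_compile_export.py](aot_compile_export.py) actually does:

```
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MAX_SEQ_LENGTH = 511  # PyTorch 2.2 limit noted above

# Hypothetical wrapper: unwraps the Hugging Face ModelOutput so the
# exported graph returns a plain logits tensor instead of a dataclass.
class BertWrapper(torch.nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, input_ids, attention_mask):
        return self.model(input_ids, attention_mask=attention_mask).logits

model = BertWrapper(
    AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-uncased")
).eval()

tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
inputs = tokenizer(
    "this is a sample input",
    padding="max_length",
    max_length=MAX_SEQ_LENGTH,
    return_tensors="pt",
)

with torch.no_grad():
    # aot_compile traces the model with example inputs and emits a
    # shared object at the given output path.
    so_path = torch._export.aot_compile(
        model,
        args=(inputs["input_ids"], inputs["attention_mask"]),
        options={"aot_inductor.output_path": "bert-seq.so"},
    )

# Save the tokenizer files (including tokenizer.json) for the C++ handler.
tokenizer.save_pretrained(".")
```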
Then, this example loads the model and runs prediction using libtorch. The C++ source code of the handler for this example can be found [here](src).
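At its core, loading and running the compiled model from C++ can look like the following sketch. It assumes the `torch::inductor::AOTIModelContainerRunnerCpu` API from PyTorch's AOTInductor runtime; the header path and exact signatures may differ between PyTorch versions, and the real handler in [src](src) additionally handles tokenization and TorchServe integration:

```
// Header path is an assumption; it has moved between PyTorch releases.
#include <torch/csrc/inductor/aoti_runner/model_container_runner_cpu.h>
#include <torch/torch.h>

#include <iostream>
#include <vector>

int main() {
  // Load the AOTInductor-compiled shared object produced by the export step.
  torch::inductor::AOTIModelContainerRunnerCpu runner("bert-seq.so");

  // Dummy input_ids and attention_mask; a real handler would build these
  // with the tokenizer loaded from tokenizer.json.
  std::vector<torch::Tensor> inputs = {
      torch::ones({1, 511}, torch::kLong),   // input_ids
      torch::ones({1, 511}, torch::kLong)};  // attention_mask

  // run() executes the compiled model and returns the output tensors.
  std::vector<torch::Tensor> outputs = runner.run(inputs);
  std::cout << outputs[0] << std::endl;
  return 0;
}
```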
### Setup
1. Follow the instructions in [README.md](../../../../cpp/README.md) to build the TorchServe C++ backend.
```
cd serve/cpp
./build.sh
```
The build script will create the necessary artifacts for this example.
To recreate them by hand, you can follow the `prepare_test_files` function of the [build.sh](../../../../cpp/build.sh) script.
We will need the handler `.so` file as well as `bert-seq.so` and `tokenizer.json`.
2. Create a [model-config.yaml](model-config.yaml).
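The exact fields depend on the handler; the sketch below is illustrative and assumes the handler resolves the compiled model and the tokenizer by path:

```
# Illustrative values; adjust to what the handler actually expects.
minWorkers: 1
maxWorkers: 1
batchSize: 2

handler:
  model_so_path: "bert-seq.so"      # AOTInductor-compiled model from step 1
  tokenizer_path: "tokenizer.json"  # tokenizer file from step 1
```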