
Error when deploying the general layout parsing pipeline v1 with the high-performance inference plugin #3514

Open
Danee-wawawa opened this issue Mar 4, 2025 · 58 comments

@Danee-wawawa

1. Deployed using the officially provided Docker environment: docker run --gpus all --name paddlex -v $PWD:/paddle --shm-size=8G --network=host -it ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:3.0.0rc0-gpu-cuda11.8-cudnn8.6-trt8.5 /bin/bash
2. Started the service with high-performance inference: paddlex --serve --pipeline layout_parsing.yaml --use_hpip
3. After uploading an image, the following error message is shown:
E0304 01:51:21.861023 190774 helper.h:131] 3: [builder.cpp::~Builder::307] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builder.cpp::~Builder::307, condition: mObjectCounter.use_count() == 1. Destroying a builder object before destroying objects it created leads to undefined behavior.
4. However, the process does not stop, and after a long wait the layout parsing result is returned. My image is 1190x1684 and it took more than 6 minutes to get a result.

How can the high-performance plugin be this slow? Did I do something wrong? Please take a look, thanks!

@Bobholamovic
Member

Hi, what you observed is normal and expected. The high-performance inference plugin combines information about the current runtime environment with stored prior knowledge to automatically select the "optimal" inference backend (inference engine) for each model. When the TensorRT backend is selected, the engine is built on the first run, which can take a long time. From the second run onward the cached engine is used, so you normally no longer have to wait that long. The "Error Code 3: API Usage Error" message may indicate some error encountered while building the engine, but since the engine was ultimately built successfully, it should be safe to ignore.

@Danee-wawawa
Author

Hi, in my case the first run had already completed. After stopping the service (Ctrl+C) and restarting it with paddlex --serve --pipeline layout_parsing.yaml --use_hpip (without deleting the files saved during the first run), inference is still very slow: 6-7 minutes per image.

@Bobholamovic
Member

Could you check whether trt_serialized* cache files were generated in the directory where the models are stored?
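One way to run this check is with a short script. A minimal sketch, assuming the default model directory ~/.paddlex/official_models/ that appears later in this thread (adjust the path if your models are stored elsewhere):

```python
# Sketch: list TensorRT engine cache files under the PaddleX model directory.
# The default path ~/.paddlex/official_models/ is taken from the logs in this
# thread; pass a different directory if your models live elsewhere.
from pathlib import Path


def find_trt_caches(models_dir) -> list[str]:
    """Return paths of trt_serialized* cache files under each model folder."""
    return sorted(str(p) for p in Path(models_dir).glob("*/trt_serialized*"))


if __name__ == "__main__":
    for path in find_trt_caches(Path.home() / ".paddlex/official_models"):
        print(path)
```

A model folder with no matching file has no TRT build cache, which usually means a non-TRT backend was selected for it.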

@Danee-wawawa
Author

Yes, I can see that trt_serialized* cache files were generated. In the general layout parsing pipeline v1 I use the following models:
1. Layout region detection model

Image
2. Text detection model

Image

3. Text recognition model

Image

4. Table recognition model

Image

Every model generated a trt_serialized cache file except the table recognition model (maybe the pipeline does not generate one for it by default?), so I don't know where the problem lies.

@Bobholamovic
Member

It's possible that for the SLANet_plus model, the backend automatically selected in your environment is not a TRT-type backend, in which case no build cache needs to be generated. In fact, when the model is loaded you should be able to see the actually selected backend in the logs printed by the program.

If the cache files exist but a single image still takes 6-7 minutes, it is very likely that the model input size exceeds the preset dynamic range, which triggers an engine rebuild. If you can determine the range of input shapes you actually need to handle, you can follow the relevant section of the high-performance inference documentation and adjust the dynamic shape ranges used for engine building by modifying the pipeline configuration file (remember to clear the cache before rerunning the program). If not, you can gather more data, as close to your real data as possible, and warm up the service with it; after a few rebuilds, each model's dynamic shape range should be automatically updated to cover your real data.

@Danee-wawawa
Author

OK, thanks! One thing I'd like to confirm: if I warm up the service with more data, then stop it with Ctrl+C and immediately restart it with paddlex --serve --pipeline layout_parsing.yaml --use_hpip, is the previous warm-up still effective? Or does every restart of the service require warming up again?

@Bobholamovic
Member

Normally, the information updated during warm-up is written into the cache files, so the next time the service starts, no warm-up is needed. If you find that the program's actual behavior doesn't match this, please report it to us as a bug!

@Danee-wawawa
Author

Danee-wawawa commented Mar 5, 2025

Hi, here is what I did:
1. First run: the cache files were generated (this took a while); stopped the service with Ctrl+C.
2. Second run: tested a 1190x1684 image; it took a long time, about 6 minutes. After the document parsing result was output, I stopped the service with Ctrl+C.
The last few lines of the log were:
W0305 09:09:20.898191 269715 place.cc:253] The paddle::PlaceType::kCPU/kGPU is deprecated since version 2.3, and will be removed in version 2.4! Please use Tensor::is_cpu()/is_gpu() method to determine the type of place.
WARNING: Logging before InitGoogleLogging() is written to STDERR

E0305 09:13:44.520833 269715 helper.h:131] 3: [builder.cpp::~Builder::307] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builder.cpp::~Builder::307, condition: mObjectCounter.use_count() == 1. Destroying a builder object before destroying objects it created leads to undefined behavior.
)
E0305 09:16:07.165756 269715 helper.h:131] 3: [builder.cpp::~Builder::307] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builder.cpp::~Builder::307, condition: mObjectCounter.use_count() == 1. Destroying a builder object before destroying objects it created leads to undefined behavior.
)
E0305 09:18:30.029475 269715 helper.h:131] 3: [builder.cpp::~Builder::307] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builder.cpp::~Builder::307, condition: mObjectCounter.use_count() == 1. Destroying a builder object before destroying objects it created leads to undefined behavior.

INFO: xxxxxxxxxxxx:52405 - "POST /layout-parsing HTTP/1.1" 200 OK
3. Third run: tested the same image uploaded in the second run; it still took a long time, about 6-7 minutes, and printed the same log as above.

Since the second and third runs used the same input image, there should be no dynamic input-size variation, right?
Given the above, could this be a bug?

@Bobholamovic
Member

During the second run, if you repeat the request with the same image, can you observe the cache taking effect?

@Danee-wawawa
Author

During the second run, repeating the request with the same image, the cache does seem to take effect, because it becomes much faster, roughly 1-2 seconds per image.

@Bobholamovic
Member

In that case, this looks like a bug. @zhang-prog could you take a look at this issue?

@Danee-wawawa
Author

Danee-wawawa commented Mar 6, 2025

Thanks! Looking forward to a fix for this soon.

@Danee-wawawa
Author

I also found another issue. In the configuration file layout_parsing.yaml I have already specified the high-performance inference backend for every model, i.e. gpu: paddle_infer. Taking text detection as an example:

Image

But after running paddlex --serve --pipeline layout_parsing.yaml --use_hpip, the output still shows one model using the tensorrt backend, as shown below:

Image

What could be the reason?

@zhang-prog
Collaborator

Hi, I was unable to reproduce your problem. I tested with the following steps:

  1. Start a new service.
  2. Test with an image within the preset dynamic shape range: the inference result is returned immediately. (as expected ✅)
  3. Test with an image outside the preset shape range: the engine is rebuilt. (as expected ✅)
  4. Test again with the image outside the preset shape range: the inference result is returned immediately. (as expected ✅)
  5. Stop the service.
  6. Start the service again.
  7. Test with the images from steps 3 and 4: the results are returned immediately, showing that the rebuilt cache takes effect. (as expected ✅)

Could you share your paddlex and ultra-infer versions, then delete all official models (rm -rf ~/.paddlex/official_models/) and run through my steps above again?

@Bobholamovic
Copy link
Member

另外我还发现一个问题,我在配置文件layout_parsing.yaml里面已经将每个模型的高性能推理配置都进行指定,即gpu: paddle_infer,以文本检测为例,如下所示:

Image

但是我在运行paddlex --serve --pipeline layout_parsing.yaml --use_hpip之后,输出信息依然会有一个模型使用了tensorrt的backend,如下所示:

Image

这是什么原因呢?

这个问题也辛苦 @zhang-prog 确认一下是不是bug

@zhang-prog
Collaborator

The issue of the backend setting being ignored has been fixed in a PR: #3534
If you are in a hurry, you can apply the PR's changes to paddlex/inference/pipelines/base.py yourself. Note, however, that not every model supports every backend; our default configuration uses the backend that works and performs well. See the high-performance inference documentation for more details.

@Danee-wawawa
Author

Danee-wawawa commented Mar 6, 2025

Thanks. My paddlex version is 3.0.0rc0 and my ultra-infer version is '1.0.0'. I built the image and container, and started the service, as follows:
docker run -d --restart=always -p 28004:8080 --gpus=all -e CUDA_VISIBLE_DEVICES=0 --name paddlex-cuda11.8 -v /data/PaddleX:/paddlex -w /paddlex --network=host -it ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlex/paddlex:paddlex3.0.0rc0-paddlepaddle3.0.0rc0-gpu-cuda11.8-cudnn8.6-trt8.5 /bin/bash

paddlex --serve --pipeline layout_parsing.yaml --use_hpip

A few more questions:
1. How did you set up your environment? Also with Docker?
2. What command do you use to start a new service?
3. For the test with an image within the preset dynamic shape range, what is the image size? Could you share the test image?
4. For the test with an image outside the preset shape range, what is the image size? Could you share the test image?

I'd like to reproduce your setup exactly first and see whether the problem persists.
@zhang-prog

@Danee-wawawa
Author

Danee-wawawa commented Mar 6, 2025

Hi, I think I've located the bug; it needs a fix on your side. First, the default preset dynamic shape is min [1, 3, 128, 64] and max [8, 3, 512, 278]. Details:

1. Start a new service.
2. Test with an image within the preset dynamic shape range (120x60): the inference result is returned immediately.
3. Test with an image outside the preset shape range (434x378): the engine is rebuilt, then the result is output, with no error!!! Screenshot:

Image

4. Stop the service.
5. Start the service again.
6. Test again with the image from step 3 (434x378): the result is returned immediately, showing that the cache from step 3 was updated and saved.
7. Test with an image outside the preset shape range (this time larger, 842x595): the engine is rebuilt and an error message appears!!! The test also takes a very long time to complete. Screenshot:

Image

8. Stop the service.
9. Start the service again.
10. Test with the image from step 7 (842x595): the same error message appears and the test again takes a very long time. Doesn't this mean the cache from step 7 was not updated and saved?

From the above, it seems that whenever the following error message appears, the newly updated cache is not saved. I'm not sure whether there is a problem with the current TensorRT build in PaddleX:
E0304 01:51:21.861023 190774 helper.h:131] 3: [builder.cpp::~Builder::307] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builder.cpp::~Builder::307, condition: mObjectCounter.use_count() == 1. Destroying a builder object before destroying objects it created leads to undefined behavior.

@zhang-prog @Bobholamovic

@zhang-prog
Collaborator

Thanks for the detailed diagnosis; it matches the problem we found on our side. We are already working on a fix and will give you a solution shortly. @Danee-wawawa

@zhang-prog
Collaborator

Hi, we have fixed this issue. Please reinstall the packages required for high-performance inference by running:

pip cache purge
pip install https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/deploy/hpi/ultra_infer/releases/3.0.0rc0/ultra_infer_gpu_python-1.0.0.3.0.0rc0-cp310-cp310-linux_x86_64.whl --force-reinstall --no-deps
pip install https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/deploy/hpi/ultra_infer/releases/3.0.0rc0/paddlex_hpi-3.0.0rc0-py3-none-any.whl --force-reinstall --no-deps

@Danee-wawawa
Author

OK, I'll try it tomorrow, thanks!

@Danee-wawawa
Author

Danee-wawawa commented Mar 7, 2025

Hi, one thing to confirm first: do I need to uninstall the plugin previously installed with paddlex --install hpi-cpu?
@zhang-prog

@Danee-wawawa
Author

Danee-wawawa commented Mar 7, 2025

The issue of the backend setting being ignored has been fixed in a PR: #3534. If you are in a hurry, you can apply the PR's changes to paddlex/inference/pipelines/base.py yourself.

On that issue: I applied the PR's changes to paddlex/inference/pipelines/base.py, as shown below:

Image

But it still has no effect; it only works if I modify the parameters in the inference.yaml file inside each model's folder, as shown below:

Image

Could you take another look?
@zhang-prog

@Danee-wawawa
Author

Hi, we have fixed this issue. Please reinstall the packages required for high-performance inference by running:

pip cache purge
pip install https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/deploy/hpi/ultra_infer/releases/3.0.0rc0/ultra_infer_gpu_python-1.0.0.3.0.0rc0-cp310-cp310-linux_x86_64.whl --force-reinstall --no-deps
pip install https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/deploy/hpi/ultra_infer/releases/3.0.0rc0/paddlex_hpi-3.0.0rc0-py3-none-any.whl --force-reinstall --no-deps

Hi, after reinstalling the high-performance inference packages following the steps above, I still ran into the same problem; the bug is not fixed. Please take a look.

Image

@zhang-prog

@zhang-prog
Collaborator

You need to install PaddleX locally; otherwise your local changes cannot affect the installed PaddleX.
Run the following, and then your modifications will take effect:

cd /PaddleX
pip install -e .

For the configuration file, you can refer to the snippet below. If you want to change dynamic_shape, you need to delete the previous cache before the engine is rebuilt; otherwise the newly set dynamic_shape may not take effect.


pipeline_name: image_classification

SubModules:
  ImageClassification:
    module_name: image_classification
    model_name: ResNet18
    model_dir: null
    batch_size: 4
    topk: 5 
    hpi_params:
      config:
        selected_backends:
          cpu: openvino # options: paddle_infer, openvino, onnx_runtime
          gpu: paddle_infer # options: paddle_infer, onnx_runtime, tensorrt
        backend_config:
          # Paddle Inference backend configuration
          paddle_infer:
            enable_trt: True # options: True, False
            trt_precision: FP32 # when enable_trt is True, options: FP32, FP16
            trt_dynamic_shapes: 
              x:
              - - 1
                - 3
                - 224
                - 224
              - - 1
                - 3
                - 224
                - 224
              - - 8
                - 3
                - 224
                - 224
          # TensorRT backend configuration
          tensorrt:
            precision: FP32 # options: FP32, FP16

@zhang-prog
Collaborator

First, please make sure the two packages were actually reinstalled; check whether pip downloaded and reinstalled them.

Normally, an out-of-range input makes TRT rebuild the engine once, but the cache is kept, so subsequent requests with the same image will not trigger another rebuild. It worked when I tested yesterday; please try a few more times.

Our test procedure after the fix:

  1. Start the service.
  2. Request with a 734x1048 image: the TRT engine is rebuilt. (as expected; it only rebuilds once)
  3. Request repeatedly with the image from step 2: results are returned immediately. (as expected; the cache takes effect)
  4. Restart the service.
  5. Request with the image from step 2: the result is returned immediately. (as expected; the cache takes effect)
  6. Request with a 514x64 image: the TRT engine is rebuilt. (as expected; it only rebuilds once)
  7. Request repeatedly with the image from step 6: results are returned immediately. (as expected; the cache takes effect)
  8. Request with the image from step 2: the result is returned immediately. (as expected; the cache takes effect)
  9. Restart the service.
  10. Request with the image from step 2: the result is returned immediately. (as expected; the cache takes effect)
  11. Request with the image from step 6: the result is returned immediately. (as expected; the cache takes effect)
@Danee-wawawa
Author

Hi, by "install locally" do you mean not pulling the image directly, but instead running git clone https://github.com/PaddlePaddle/PaddleX.git and then
cd /PaddleX
pip install -e .

to install PaddleX?

@zhang-prog
Collaborator

Yes, use the release/3.0-rc branch.

@Danee-wawawa
Author

Danee-wawawa commented Mar 7, 2025

OK. Since our project must run inside Docker, can I first pull a paddlepaddle 3.0.0rc0 Docker image, as follows:
docker run --gpus all --name paddlex -v $PWD:/paddle --shm-size=8G --network=host -it ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddle:3.0.0rc0-gpu-cuda11.8-cudnn8.6-trt8.5 /bin/bash

and then install the release/3.0-rc branch of PaddleX inside that container?

@zhang-prog
Collaborator

Absolutely.

@Danee-wawawa
Author

Great, thanks! I'll rebuild the environment this afternoon and try again.

@zhang-prog
Collaborator

I've synced these changes to the release/3.0-rc branch. Enter the container, switch to that branch, pull the latest code, and install locally with pip install -e ..

@Danee-wawawa
Author

OK 👌

@Danee-wawawa
Author

Hi, after running pip install -e . I got the following error message. I'm not sure whether it will affect subsequent work, so I'm flagging it with you first.

Image

@zhang-prog
Collaborator

Run apt remove python3-yaml, then run pip install -e . again.

@Danee-wawawa
Author

OK 👌

@Danee-wawawa
Author

Hi, we have fixed this issue. Please reinstall the packages required for high-performance inference by running:

pip cache purge
pip install https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/deploy/hpi/ultra_infer/releases/3.0.0rc0/ultra_infer_gpu_python-1.0.0.3.0.0rc0-cp310-cp310-linux_x86_64.whl --force-reinstall --no-deps
pip install https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/deploy/hpi/ultra_infer/releases/3.0.0rc0/paddlex_hpi-3.0.0rc0-py3-none-any.whl --force-reinstall --no-deps

Do I still need to run the three commands above?

@zhang-prog
Collaborator

Yes, and keep an eye on whether the packages are actually reinstalled.

@Danee-wawawa
Author

OK, the output below should mean the reinstall succeeded:

Image

Then after running paddlex --serve --pipeline layout_parsing --use_hpip, it reported that a package is not installed, as shown below:

Image

@zhang-prog
Collaborator

You need to install the serving plugin:

paddlex --install serving

Also, did you start a new container? In a new container you don't need --force-reinstall and --no-deps when installing the high-performance inference packages; just install them directly. Please run:
pip install https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/deploy/hpi/ultra_infer/releases/3.0.0rc0/ultra_infer_gpu_python-1.0.0.3.0.0rc0-cp310-cp310-linux_x86_64.whl
pip install https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/deploy/hpi/ultra_infer/releases/3.0.0rc0/paddlex_hpi-3.0.0rc0-py3-none-any.whl

@Danee-wawawa
Author

You need to install the serving plugin:

paddlex --install serving

Also, did you start a new container? In a new container you don't need --force-reinstall and --no-deps when installing the high-performance inference packages; just install them directly. Please run:

pip install https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/deploy/hpi/ultra_infer/releases/3.0.0rc0/ultra_infer_gpu_python-1.0.0.3.0.0rc0-cp310-cp310-linux_x86_64.whl
pip install https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/deploy/hpi/ultra_infer/releases/3.0.0rc0/paddlex_hpi-3.0.0rc0-py3-none-any.whl

Right, I forgot to run paddlex --install serving.
And yes, I started a new container, so I'll rerun the installation.

@Danee-wawawa
Author

Danee-wawawa commented Mar 7, 2025

Hi, after I ran the command
pip install https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/deploy/hpi/ultra_infer/releases/3.0.0rc0/ultra_infer_gpu_python-1.0.0.3.0.0rc0-cp310-cp310-linux_x86_64.whl
the output was as follows:

Image

Is something wrong with my environment?
@zhang-prog

@zhang-prog
Collaborator

You can ignore that message; it's just informational.

@Danee-wawawa
Author

Hi, my environment is fully set up. I ran paddlex --serve --pipeline layout_parsing --use_hpip and the service started, but when I tested an image, the same situation as before occurred and the same error was reported.

Image

@zhang-prog

@Danee-wawawa
Author

Danee-wawawa commented Mar 7, 2025

Here are the logs from my two service startups, please take a look. The second time, the service doesn't even start; it just hangs at the end.

λ 07e8af1fba50 /paddlex_git/PaddleX/test paddlex --serve --pipeline layout_parsing_ori.yaml --use_hpip
Using official model (PP-LCNet_x1_0_doc_ori), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Connecting to https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-LCNet_x1_0_doc_ori_infer.tar ...
Downloading PP-LCNet_x1_0_doc_ori_infer.tar ...
[==================================================] 100.00%
Extracting PP-LCNet_x1_0_doc_ori_infer.tar
[==================================================] 100.00%
Only Paddle model is detected. Paddle model will be used by default.
Backend: tensorrt
Backend config: precision='FP16' dynamic_shapes={'x': [[1, 3, 224, 224], [1, 3, 224, 224], [8, 3, 224, 224]]}
[INFO] ultra_infer/vision/common/processors/transform.cc(91)::FuseNormalizeHWC2CHW	Normalize and HWC2CHW are fused to NormalizeAndPermute  in preprocessing pipeline.
[INFO] ultra_infer/vision/common/processors/transform.cc(157)::FuseNormalizeColorConvert	BGR2RGB and NormalizeAndPermute are fused to NormalizeAndPermute with swap_rb=1
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(567)::BuildTrtEngine	[TrtBackend] Use FP16 to inference.
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(572)::BuildTrtEngine	Start to building TensorRT Engine...
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(659)::BuildTrtEngine	TensorRT Engine is built successfully.
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(661)::BuildTrtEngine	Serialize TensorRTEngine to local file /root/.paddlex/official_models/PP-LCNet_x1_0_doc_ori/trt_serialized_FP16.trt.
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(672)::BuildTrtEngine	TensorRTEngine is serialized to local file /root/.paddlex/official_models/PP-LCNet_x1_0_doc_ori/trt_serialized_FP16.trt, we can load this model from the seralized engine directly next time.
[INFO] ultra_infer/runtime/runtime.cc(314)::CreateTrtBackend	Runtime initialized with Backend::TRT in Device::GPU.
Using official model (UVDoc), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Connecting to https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/UVDoc_infer.tar ...
Downloading UVDoc_infer.tar ...
[==================================================] 100.00%
Extracting UVDoc_infer.tar
[==================================================] 100.00%
Only Paddle model is detected. Paddle model will be used by default.
Backend: tensorrt
Backend config: precision='FP16' dynamic_shapes={'image': [[1, 3, 128, 64], [1, 3, 256, 128], [8, 3, 512, 256]]}
[Paddle2ONNX] Opset version will change to 16 from 11
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(567)::BuildTrtEngine	[TrtBackend] Use FP16 to inference.
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(572)::BuildTrtEngine	Start to building TensorRT Engine...
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(659)::BuildTrtEngine	TensorRT Engine is built successfully.
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(661)::BuildTrtEngine	Serialize TensorRTEngine to local file /root/.paddlex/official_models/UVDoc/trt_serialized_FP16.trt.
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(672)::BuildTrtEngine	TensorRTEngine is serialized to local file /root/.paddlex/official_models/UVDoc/trt_serialized_FP16.trt, we can load this model from the seralized engine directly next time.
[INFO] ultra_infer/runtime/runtime.cc(314)::CreateTrtBackend	Runtime initialized with Backend::TRT in Device::GPU.
Using official model (PicoDet-L_layout_17cls), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Connecting to https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PicoDet-L_layout_17cls_infer.tar ...
Downloading PicoDet-L_layout_17cls_infer.tar ...
[==================================================] 100.00%
Extracting PicoDet-L_layout_17cls_infer.tar
[==================================================] 100.00%
Only Paddle model is detected. Paddle model will be used by default.
Backend: paddle_infer
Backend config: cpu_num_threads=8 enable_mkldnn=True enable_trt=True trt_dynamic_shapes={'image': [[1, 3, 640, 640], [1, 3, 640, 640], [8, 3, 640, 640]], 'scale_factor': [[1, 2], [1, 2], [8, 2]]} trt_dynamic_shape_input_data={'scale_factor': [[2.0, 2.0], [1.0, 1.0], [0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67]]} trt_precision='FP16' enable_log_info=False
[INFO] ultra_infer/vision/common/processors/transform.cc(44)::FuseNormalizeCast	Normalize and Cast are fused to Normalize in preprocessing pipeline.
[INFO] ultra_infer/vision/common/processors/transform.cc(91)::FuseNormalizeHWC2CHW	Normalize and HWC2CHW are fused to NormalizeAndPermute  in preprocessing pipeline.
[INFO] ultra_infer/vision/common/processors/transform.cc(157)::FuseNormalizeColorConvert	BGR2RGB and NormalizeAndPermute are fused to NormalizeAndPermute with swap_rb=1
[INFO] ultra_infer/runtime/backends/paddle/paddle_backend.cc(28)::BuildOption	Will inference_precision float32
[INFO] ultra_infer/runtime/backends/paddle/paddle_backend.cc(73)::BuildOption	Will try to use tensorrt fp16 inference with Paddle Backend.
[WARNING] ultra_infer/runtime/backends/paddle/paddle_backend.cc(79)::BuildOption	Detect that tensorrt cache file has been set to /root/.paddlex/official_models/PicoDet-L_layout_17cls/trt_serialized.trt, but while enable paddle2trt, please notice that the cache file will save to the directory where paddle model saved.
[WARNING] ultra_infer/runtime/backends/paddle/paddle_backend.cc(173)::BuildOption	Currently, Paddle-TensorRT does not support the new IR, and the old IR will be used.
[INFO] ultra_infer/runtime/backends/paddle/paddle_backend.cc(288)::InitFromPaddle	Start generating shape range info file.
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0307 06:49:51.663447 39223 analysis_config.cc:1475] In CollectShapeInfo mode, we will disable optimizations and collect the shape information of all intermediate tensors in the compute graph and calculate the min_shape, max_shape and opt_shape.
I0307 06:49:51.684185 39223 analysis_predictor.cc:2057] Ir optimization is turned off, no ir pass will be executed.
--- Running analysis [ir_graph_build_pass]
I0307 06:49:51.688817 39223 executor.cc:183] Old Executor is Running.
--- Running analysis [ir_analysis_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
I0307 06:49:51.711777 39223 ir_params_sync_among_devices_pass.cc:50] Sync params from CPU to GPU
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0307 06:49:51.712657 39223 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.9, Driver API Version: 12.4, Runtime API Version: 11.8
W0307 06:49:51.713122 39223 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [save_optimized_model_pass]
--- Running analysis [ir_graph_to_program_pass]
I0307 06:49:51.783936 39223 analysis_predictor.cc:2146] ======= ir optimization completed =======
I0307 06:49:51.785835 39223 naive_executor.cc:211] ---  skip [feed], feed -> scale_factor
I0307 06:49:51.785847 39223 naive_executor.cc:211] ---  skip [feed], feed -> image
I0307 06:49:51.787603 39223 naive_executor.cc:211] ---  skip [multiclass_nms3_0.tmp_0], fetch -> fetch
I0307 06:49:51.787613 39223 naive_executor.cc:211] ---  skip [multiclass_nms3_0.tmp_2], fetch -> fetch
[INFO] ultra_infer/runtime/backends/paddle/paddle_backend.cc(551)::GetDynamicShapeFromOption	image: the max shape = [8, 3, 640, 640], the min shape = [1, 3, 640, 640], the opt shape = [1, 3, 640, 640]
[INFO] ultra_infer/runtime/backends/paddle/paddle_backend.cc(551)::GetDynamicShapeFromOption	scale_factor: the max shape = [8, 2], the min shape = [1, 2], the opt shape = [1, 2]
W0307 06:49:51.797943 39223 analysis_predictor.cc:2646] When collecting shapes, it is recommended to run multiple loops to obtain more accurate shape information.
I0307 06:49:51.798503 39223 program_interpreter.cc:243] New Executor is Running.
[INFO] ultra_infer/runtime/backends/paddle/paddle_backend.cc(321)::InitFromPaddle	Finish generating shape range info file.
[INFO] ultra_infer/runtime/backends/paddle/paddle_backend.cc(323)::InitFromPaddle	Start loading shape range info file /root/.paddlex/official_models/PicoDet-L_layout_17cls/shape_range_info.pbtxt to set TensorRT dynamic shape.
E0307 06:49:53.382944 39223 helper.h:131] Could not register plugin creator -  ::BatchedNMSDynamic_TRT version 1
E0307 06:49:53.382982 39223 helper.h:131] Could not register plugin creator -  ::BatchedNMS_TRT version 1
E0307 06:49:53.382990 39223 helper.h:131] Could not register plugin creator -  ::BatchTilePlugin_TRT version 1
E0307 06:49:53.382998 39223 helper.h:131] Could not register plugin creator -  ::Clip_TRT version 1
E0307 06:49:53.383021 39223 helper.h:131] Could not register plugin creator -  ::CoordConvAC version 1
E0307 06:49:53.383030 39223 helper.h:131] Could not register plugin creator -  ::CropAndResizeDynamic version 1
E0307 06:49:53.383041 39223 helper.h:131] Could not register plugin creator -  ::CropAndResize version 1
E0307 06:49:53.383050 39223 helper.h:131] Could not register plugin creator -  ::DecodeBbox3DPlugin version 1
E0307 06:49:53.383060 39223 helper.h:131] Could not register plugin creator -  ::DetectionLayer_TRT version 1
E0307 06:49:53.383075 39223 helper.h:131] Could not register plugin creator -  ::EfficientNMS_Explicit_TF_TRT version 1
E0307 06:49:53.383088 39223 helper.h:131] Could not register plugin creator -  ::EfficientNMS_Implicit_TF_TRT version 1
E0307 06:49:53.383096 39223 helper.h:131] Could not register plugin creator -  ::EfficientNMS_ONNX_TRT version 1
E0307 06:49:53.383107 39223 helper.h:131] Could not register plugin creator -  ::EfficientNMS_TRT version 1
E0307 06:49:53.383117 39223 helper.h:131] Could not register plugin creator -  ::FlattenConcat_TRT version 1
E0307 06:49:53.383131 39223 helper.h:131] Could not register plugin creator -  ::fMHA_V2 version 1
E0307 06:49:53.383138 39223 helper.h:131] Could not register plugin creator -  ::fMHCA version 1
E0307 06:49:53.383150 39223 helper.h:131] Could not register plugin creator -  ::GenerateDetection_TRT version 1
E0307 06:49:53.383160 39223 helper.h:131] Could not register plugin creator -  ::GridAnchor_TRT version 1
E0307 06:49:53.383172 39223 helper.h:131] Could not register plugin creator -  ::GridAnchorRect_TRT version 1
E0307 06:49:53.383184 39223 helper.h:131] Could not register plugin creator -  ::GroupNorm version 1
E0307 06:49:53.383193 39223 helper.h:131] Could not register plugin creator -  ::InstanceNormalization_TRT version 1
E0307 06:49:53.383204 39223 helper.h:131] Could not register plugin creator -  ::InstanceNormalization_TRT version 2
E0307 06:49:53.383212 39223 helper.h:131] Could not register plugin creator -  ::LayerNorm version 1
E0307 06:49:53.383224 39223 helper.h:131] Could not register plugin creator -  ::LReLU_TRT version 1
E0307 06:49:53.383231 39223 helper.h:131] Could not register plugin creator -  ::MultilevelCropAndResize_TRT version 1
E0307 06:49:53.383244 39223 helper.h:131] Could not register plugin creator -  ::MultilevelProposeROI_TRT version 1
E0307 06:49:53.383255 39223 helper.h:131] Could not register plugin creator -  ::MultiscaleDeformableAttnPlugin_TRT version 1
E0307 06:49:53.383268 39223 helper.h:131] Could not register plugin creator -  ::NMSDynamic_TRT version 1
E0307 06:49:53.383277 39223 helper.h:131] Could not register plugin creator -  ::NMS_TRT version 1
E0307 06:49:53.383291 39223 helper.h:131] Could not register plugin creator -  ::Normalize_TRT version 1
E0307 06:49:53.383298 39223 helper.h:131] Could not register plugin creator -  ::PillarScatterPlugin version 1
E0307 06:49:53.383306 39223 helper.h:131] Could not register plugin creator -  ::PriorBox_TRT version 1
E0307 06:49:53.383317 39223 helper.h:131] Could not register plugin creator -  ::ProposalDynamic version 1
E0307 06:49:53.383325 39223 helper.h:131] Could not register plugin creator -  ::ProposalLayer_TRT version 1
E0307 06:49:53.383332 39223 helper.h:131] Could not register plugin creator -  ::Proposal version 1
E0307 06:49:53.383342 39223 helper.h:131] Could not register plugin creator -  ::PyramidROIAlign_TRT version 1
E0307 06:49:53.383348 39223 helper.h:131] Could not register plugin creator -  ::Region_TRT version 1
E0307 06:49:53.383360 39223 helper.h:131] Could not register plugin creator -  ::Reorg_TRT version 1
E0307 06:49:53.383369 39223 helper.h:131] Could not register plugin creator -  ::ResizeNearest_TRT version 1
E0307 06:49:53.383376 39223 helper.h:131] Could not register plugin creator -  ::ROIAlign_TRT version 1
E0307 06:49:53.383383 39223 helper.h:131] Could not register plugin creator -  ::RPROI_TRT version 1
E0307 06:49:53.383394 39223 helper.h:131] Could not register plugin creator -  ::ScatterND version 1
E0307 06:49:53.383405 39223 helper.h:131] Could not register plugin creator -  ::SeqLen2Spatial version 1
E0307 06:49:53.383416 39223 helper.h:131] Could not register plugin creator -  ::SpecialSlice_TRT version 1
E0307 06:49:53.383426 39223 helper.h:131] Could not register plugin creator -  ::SplitGeLU version 1
E0307 06:49:53.383436 39223 helper.h:131] Could not register plugin creator -  ::Split version 1
E0307 06:49:53.383447 39223 helper.h:131] Could not register plugin creator -  ::VoxelGeneratorPlugin version 1
W0307 06:49:55.479466 39223 place.cc:253] The `paddle::PlaceType::kCPU/kGPU` is deprecated since version 2.3, and will be removed in version 2.4! Please use `Tensor::is_cpu()/is_gpu()` method to determine the type of place.
[INFO] ultra_infer/runtime/runtime.cc(265)::CreatePaddleBackend	Runtime initialized with Backend::PDINFER in Device::GPU.
Using official model (PP-OCRv4_server_det), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Connecting to https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-OCRv4_server_det_infer.tar ...
Downloading PP-OCRv4_server_det_infer.tar ...
[==================================================] 100.00%
Extracting PP-OCRv4_server_det_infer.tar
[==================================================] 100.00%
The default value for `limit_type` is max, and cannot be set in PaddleX HPI.
Only Paddle model is detected. Paddle model will be used by default.
Backend: tensorrt
Backend config: precision='FP32' dynamic_shapes={'x': [[1, 3, 160, 160], [1, 3, 640, 640], [1, 3, 1280, 1280]]}
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(572)::BuildTrtEngine	Start to building TensorRT Engine...
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(659)::BuildTrtEngine	TensorRT Engine is built successfully.
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(661)::BuildTrtEngine	Serialize TensorRTEngine to local file /root/.paddlex/official_models/PP-OCRv4_server_det/trt_serialized_FP32.trt.
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(672)::BuildTrtEngine	TensorRTEngine is serialized to local file /root/.paddlex/official_models/PP-OCRv4_server_det/trt_serialized_FP32.trt, we can load this model from the seralized engine directly next time.
[INFO] ultra_infer/runtime/runtime.cc(314)::CreateTrtBackend	Runtime initialized with Backend::TRT in Device::GPU.
Using official model (PP-OCRv4_server_rec), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Connecting to https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/PP-OCRv4_server_rec_infer.tar ...
Downloading PP-OCRv4_server_rec_infer.tar ...
[==================================================] 100.00%
Extracting PP-OCRv4_server_rec_infer.tar
[==================================================] 100.00%
Only Paddle model is detected. Paddle model will be used by default.
Backend: paddle_infer
Backend config: cpu_num_threads=8 enable_mkldnn=True enable_trt=True trt_dynamic_shapes={'x': [[1, 3, 48, 160], [1, 3, 48, 320], [8, 3, 48, 640]]} trt_dynamic_shape_input_data=None trt_precision='FP16' enable_log_info=False
[INFO] ultra_infer/runtime/backends/paddle/paddle_backend.cc(28)::BuildOption	Will inference_precision float32
[INFO] ultra_infer/runtime/backends/paddle/paddle_backend.cc(73)::BuildOption	Will try to use tensorrt fp16 inference with Paddle Backend.
[WARNING] ultra_infer/runtime/backends/paddle/paddle_backend.cc(79)::BuildOption	Detect that tensorrt cache file has been set to /root/.paddlex/official_models/PP-OCRv4_server_rec/trt_serialized.trt, but while enable paddle2trt, please notice that the cache file will save to the directory where paddle model saved.
[WARNING] ultra_infer/runtime/backends/paddle/paddle_backend.cc(173)::BuildOption	Currently, Paddle-TensorRT does not support the new IR, and the old IR will be used.
[INFO] ultra_infer/runtime/backends/paddle/paddle_backend.cc(288)::InitFromPaddle	Start generating shape range info file.
--- Running analysis [ir_graph_build_pass]
--- Running analysis [ir_analysis_pass]
--- Running analysis [ir_params_sync_among_devices_pass]
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [save_optimized_model_pass]
--- Running analysis [ir_graph_to_program_pass]
[INFO] ultra_infer/runtime/backends/paddle/paddle_backend.cc(551)::GetDynamicShapeFromOption	x: the max shape = [8, 3, 48, 640], the min shape = [1, 3, 48, 160], the opt shape = [1, 3, 48, 320]
[INFO] ultra_infer/runtime/backends/paddle/paddle_backend.cc(321)::InitFromPaddle	Finish generating shape range info file.
[INFO] ultra_infer/runtime/backends/paddle/paddle_backend.cc(323)::InitFromPaddle	Start loading shape range info file /root/.paddlex/official_models/PP-OCRv4_server_rec/shape_range_info.pbtxt to set TensorRT dynamic shape.
E0307 07:02:32.092159 39223 helper.h:131] Could not register plugin creator -  ::BatchedNMSDynamic_TRT version 1
E0307 07:02:32.092187 39223 helper.h:131] Could not register plugin creator -  ::BatchedNMS_TRT version 1
E0307 07:02:32.092206 39223 helper.h:131] Could not register plugin creator -  ::BatchTilePlugin_TRT version 1
E0307 07:02:32.092216 39223 helper.h:131] Could not register plugin creator -  ::Clip_TRT version 1
E0307 07:02:32.092226 39223 helper.h:131] Could not register plugin creator -  ::CoordConvAC version 1
E0307 07:02:32.092234 39223 helper.h:131] Could not register plugin creator -  ::CropAndResizeDynamic version 1
E0307 07:02:32.092244 39223 helper.h:131] Could not register plugin creator -  ::CropAndResize version 1
E0307 07:02:32.092252 39223 helper.h:131] Could not register plugin creator -  ::DecodeBbox3DPlugin version 1
E0307 07:02:32.092262 39223 helper.h:131] Could not register plugin creator -  ::DetectionLayer_TRT version 1
E0307 07:02:32.092269 39223 helper.h:131] Could not register plugin creator -  ::EfficientNMS_Explicit_TF_TRT version 1
E0307 07:02:32.092280 39223 helper.h:131] Could not register plugin creator -  ::EfficientNMS_Implicit_TF_TRT version 1
E0307 07:02:32.092286 39223 helper.h:131] Could not register plugin creator -  ::EfficientNMS_ONNX_TRT version 1
E0307 07:02:32.092300 39223 helper.h:131] Could not register plugin creator -  ::EfficientNMS_TRT version 1
E0307 07:02:32.092314 39223 helper.h:131] Could not register plugin creator -  ::FlattenConcat_TRT version 1
E0307 07:02:32.092329 39223 helper.h:131] Could not register plugin creator -  ::fMHA_V2 version 1
E0307 07:02:32.092356 39223 helper.h:131] Could not register plugin creator -  ::fMHCA version 1
E0307 07:02:32.092373 39223 helper.h:131] Could not register plugin creator -  ::GenerateDetection_TRT version 1
E0307 07:02:32.092386 39223 helper.h:131] Could not register plugin creator -  ::GridAnchor_TRT version 1
E0307 07:02:32.092392 39223 helper.h:131] Could not register plugin creator -  ::GridAnchorRect_TRT version 1
E0307 07:02:32.092401 39223 helper.h:131] Could not register plugin creator -  ::GroupNorm version 1
E0307 07:02:32.092408 39223 helper.h:131] Could not register plugin creator -  ::InstanceNormalization_TRT version 1
E0307 07:02:32.092417 39223 helper.h:131] Could not register plugin creator -  ::InstanceNormalization_TRT version 2
E0307 07:02:32.092424 39223 helper.h:131] Could not register plugin creator -  ::LayerNorm version 1
E0307 07:02:32.092435 39223 helper.h:131] Could not register plugin creator -  ::LReLU_TRT version 1
E0307 07:02:32.092449 39223 helper.h:131] Could not register plugin creator -  ::MultilevelCropAndResize_TRT version 1
E0307 07:02:32.092458 39223 helper.h:131] Could not register plugin creator -  ::MultilevelProposeROI_TRT version 1
E0307 07:02:32.092474 39223 helper.h:131] Could not register plugin creator -  ::MultiscaleDeformableAttnPlugin_TRT version 1
E0307 07:02:32.092490 39223 helper.h:131] Could not register plugin creator -  ::NMSDynamic_TRT version 1
E0307 07:02:32.092505 39223 helper.h:131] Could not register plugin creator -  ::NMS_TRT version 1
E0307 07:02:32.092515 39223 helper.h:131] Could not register plugin creator -  ::Normalize_TRT version 1
E0307 07:02:32.092523 39223 helper.h:131] Could not register plugin creator -  ::PillarScatterPlugin version 1
E0307 07:02:32.092535 39223 helper.h:131] Could not register plugin creator -  ::PriorBox_TRT version 1
E0307 07:02:32.092552 39223 helper.h:131] Could not register plugin creator -  ::ProposalDynamic version 1
E0307 07:02:32.092563 39223 helper.h:131] Could not register plugin creator -  ::ProposalLayer_TRT version 1
E0307 07:02:32.092569 39223 helper.h:131] Could not register plugin creator -  ::Proposal version 1
E0307 07:02:32.092581 39223 helper.h:131] Could not register plugin creator -  ::PyramidROIAlign_TRT version 1
E0307 07:02:32.092592 39223 helper.h:131] Could not register plugin creator -  ::Region_TRT version 1
E0307 07:02:32.092624 39223 helper.h:131] Could not register plugin creator -  ::Reorg_TRT version 1
E0307 07:02:32.092634 39223 helper.h:131] Could not register plugin creator -  ::ResizeNearest_TRT version 1
E0307 07:02:32.092643 39223 helper.h:131] Could not register plugin creator -  ::ROIAlign_TRT version 1
E0307 07:02:32.092654 39223 helper.h:131] Could not register plugin creator -  ::RPROI_TRT version 1
E0307 07:02:32.092671 39223 helper.h:131] Could not register plugin creator -  ::ScatterND version 1
E0307 07:02:32.092681 39223 helper.h:131] Could not register plugin creator -  ::SeqLen2Spatial version 1
E0307 07:02:32.092689 39223 helper.h:131] Could not register plugin creator -  ::SpecialSlice_TRT version 1
E0307 07:02:32.092703 39223 helper.h:131] Could not register plugin creator -  ::SplitGeLU version 1
E0307 07:02:32.092715 39223 helper.h:131] Could not register plugin creator -  ::Split version 1
E0307 07:02:32.092723 39223 helper.h:131] Could not register plugin creator -  ::VoxelGeneratorPlugin version 1
[INFO] ultra_infer/runtime/runtime.cc(265)::CreatePaddleBackend	Runtime initialized with Backend::PDINFER in Device::GPU.
Using official model (SLANet_plus), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Connecting to https://paddle-model-ecology.bj.bcebos.com/paddlex/official_inference_model/paddle3.0rc0/SLANet_plus_infer.tar ...
Downloading SLANet_plus_infer.tar ...
[==================================================] 100.00%
Extracting SLANet_plus_infer.tar
[==================================================] 100.00%
Only Paddle model is detected. Paddle model will be used by default.
Backend: paddle_infer
Backend config: cpu_num_threads=8 enable_mkldnn=True enable_trt=False trt_dynamic_shapes={'x': [[1, 3, 32, 32], [1, 3, 64, 448], [8, 3, 488, 488]]} trt_dynamic_shape_input_data=None trt_precision='FP32' enable_log_info=False
[INFO] ultra_infer/runtime/backends/paddle/paddle_backend.cc(28)::BuildOption	Will inference_precision float32
[INFO] ultra_infer/runtime/runtime.cc(265)::CreatePaddleBackend	Runtime initialized with Backend::PDINFER in Device::GPU.
INFO:     Started server process [39223]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
[WARNING] ultra_infer/runtime/backends/tensorrt/utils.cc(40)::Update	[New Shape Out of Range] input name: image, shape: [1, 3, 842, 595], The shape range before: min_shape=[1, 3, 128, 64], max_shape=[8, 3, 512, 256].
[WARNING] ultra_infer/runtime/backends/tensorrt/utils.cc(52)::Update	[New Shape Out of Range] The updated shape range now: min_shape=[1, 3, 128, 64], max_shape=[8, 3, 842, 595].
[WARNING] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(328)::Infer	TensorRT engine will be rebuilt once shape range information changed, this may take lots of time, you can set a proper shape range before loading model to avoid rebuilding process. refer https://github.com/PaddlePaddle/FastDeploy/blob/develop/docs/en/faq/tensorrt_tricks.md for more details.
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(567)::BuildTrtEngine	[TrtBackend] Use FP16 to inference.
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(572)::BuildTrtEngine	Start to building TensorRT Engine...
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(659)::BuildTrtEngine	TensorRT Engine is built successfully.
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(661)::BuildTrtEngine	Serialize TensorRTEngine to local file /root/.paddlex/official_models/UVDoc/trt_serialized_FP16.trt.
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(672)::BuildTrtEngine	TensorRTEngine is serialized to local file /root/.paddlex/official_models/UVDoc/trt_serialized_FP16.trt, we can load this model from the seralized engine directly next time.
The default value for `limit_type` is max, and cannot be set in PaddleX HPI.
E0307 07:08:22.165284 58297 helper.h:131] 3: [builder.cpp::~Builder::307] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builder.cpp::~Builder::307, condition: mObjectCounter.use_count() == 1. Destroying a builder object before destroying objects it created leads to undefined behavior.
)
E0307 07:10:44.335050 58297 helper.h:131] 3: [builder.cpp::~Builder::307] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builder.cpp::~Builder::307, condition: mObjectCounter.use_count() == 1. Destroying a builder object before destroying objects it created leads to undefined behavior.
)
E0307 07:13:06.902433 58297 helper.h:131] 3: [builder.cpp::~Builder::307] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builder.cpp::~Builder::307, condition: mObjectCounter.use_count() == 1. Destroying a builder object before destroying objects it created leads to undefined behavior.
)
INFO:     :50725 - "POST /layout-parsing HTTP/1.1" 200 OK
The default value for `limit_type` is max, and cannot be set in PaddleX HPI.
INFO:     :51973 - "POST /layout-parsing HTTP/1.1" 200 OK
The default value for `limit_type` is max, and cannot be set in PaddleX HPI.
INFO:     :51973 - "POST /layout-parsing HTTP/1.1" 200 OK
The default value for `limit_type` is max, and cannot be set in PaddleX HPI.
INFO:     :51973 - "POST /layout-parsing HTTP/1.1" 200 OK
The default value for `limit_type` is max, and cannot be set in PaddleX HPI.
INFO:     :51973 - "POST /layout-parsing HTTP/1.1" 200 OK
The default value for `limit_type` is max, and cannot be set in PaddleX HPI.
INFO:     :51973 - "POST /layout-parsing HTTP/1.1" 200 OK
The default value for `limit_type` is max, and cannot be set in PaddleX HPI.
INFO:     :51973 - "POST /layout-parsing HTTP/1.1" 200 OK
The default value for `limit_type` is max, and cannot be set in PaddleX HPI.
INFO:     :51973 - "POST /layout-parsing HTTP/1.1" 200 OK
The default value for `limit_type` is max, and cannot be set in PaddleX HPI.
INFO:     :51973 - "POST /layout-parsing HTTP/1.1" 200 OK
The default value for `limit_type` is max, and cannot be set in PaddleX HPI.
INFO:     :51973 - "POST /layout-parsing HTTP/1.1" 200 OK
The default value for `limit_type` is max, and cannot be set in PaddleX HPI.
INFO:     :51973 - "POST /layout-parsing HTTP/1.1" 200 OK
The default value for `limit_type` is max, and cannot be set in PaddleX HPI.
INFO:     :51973 - "POST /layout-parsing HTTP/1.1" 200 OK
The default value for `limit_type` is max, and cannot be set in PaddleX HPI.
INFO:     :51973 - "POST /layout-parsing HTTP/1.1" 200 OK
The default value for `limit_type` is max, and cannot be set in PaddleX HPI.
INFO:     :51973 - "POST /layout-parsing HTTP/1.1" 200 OK
The default value for `limit_type` is max, and cannot be set in PaddleX HPI.
INFO:     :51973 - "POST /layout-parsing HTTP/1.1" 200 OK
^CINFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [39223]
λ 07e8af1fba50 /paddlex_git/PaddleX/test paddlex --serve --pipeline layout_parsing_ori.yaml --use_hpip
Using official model (PP-LCNet_x1_0_doc_ori), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Only Paddle model is detected. Paddle model will be used by default.
Backend: tensorrt
Backend config: precision='FP16' dynamic_shapes={'x': [[1, 3, 224, 224], [1, 3, 224, 224], [8, 3, 224, 224]]}
[INFO] ultra_infer/vision/common/processors/transform.cc(91)::FuseNormalizeHWC2CHW	Normalize and HWC2CHW are fused to NormalizeAndPermute  in preprocessing pipeline.
[INFO] ultra_infer/vision/common/processors/transform.cc(157)::FuseNormalizeColorConvert	BGR2RGB and NormalizeAndPermute are fused to NormalizeAndPermute with swap_rb=1
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(719)::CreateTrtEngineFromOnnx	Detect serialized TensorRT Engine file in /root/.paddlex/official_models/PP-LCNet_x1_0_doc_ori/trt_serialized_FP16.trt, will load it directly.
[WARNING] ultra_infer/runtime/backends/tensorrt/utils.cc(40)::Update	[New Shape Out of Range] input name: x, shape: [8, 3, 224, 224], The shape range before: min_shape=[-1, 3, 224, 224], max_shape=[-1, 3, 224, 224].
[WARNING] ultra_infer/runtime/backends/tensorrt/utils.cc(52)::Update	[New Shape Out of Range] The updated shape range now: min_shape=[8, 3, 224, 224], max_shape=[8, 3, 224, 224].
[WARNING] ultra_infer/runtime/backends/tensorrt/utils.cc(40)::Update	[New Shape Out of Range] input name: x, shape: [1, 3, 224, 224], The shape range before: min_shape=[8, 3, 224, 224], max_shape=[8, 3, 224, 224].
[WARNING] ultra_infer/runtime/backends/tensorrt/utils.cc(52)::Update	[New Shape Out of Range] The updated shape range now: min_shape=[1, 3, 224, 224], max_shape=[8, 3, 224, 224].
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(108)::LoadTrtCache	Build TensorRT Engine from cache file: /root/.paddlex/official_models/PP-LCNet_x1_0_doc_ori/trt_serialized_FP16.trt with shape range information as below,
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(111)::LoadTrtCache	Input name: x, shape=[-1, 3, 224, 224], min=[1, 3, 224, 224], max=[8, 3, 224, 224]

[INFO] ultra_infer/runtime/runtime.cc(314)::CreateTrtBackend	Runtime initialized with Backend::TRT in Device::GPU.
Using official model (UVDoc), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Only Paddle model is detected. Paddle model will be used by default.
Backend: tensorrt
Backend config: precision='FP16' dynamic_shapes={'image': [[1, 3, 128, 64], [1, 3, 256, 128], [8, 3, 512, 256]]}
[Paddle2ONNX] Opset version will change to 16 from 11
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(719)::CreateTrtEngineFromOnnx	Detect serialized TensorRT Engine file in /root/.paddlex/official_models/UVDoc/trt_serialized_FP16.trt, will load it directly.
[WARNING] ultra_infer/runtime/backends/tensorrt/utils.cc(40)::Update	[New Shape Out of Range] input name: image, shape: [8, 3, 842, 595], The shape range before: min_shape=[-1, 3, -1, -1], max_shape=[-1, 3, -1, -1].
[WARNING] ultra_infer/runtime/backends/tensorrt/utils.cc(52)::Update	[New Shape Out of Range] The updated shape range now: min_shape=[8, 3, 842, 595], max_shape=[8, 3, 842, 595].
[WARNING] ultra_infer/runtime/backends/tensorrt/utils.cc(40)::Update	[New Shape Out of Range] input name: image, shape: [1, 3, 128, 64], The shape range before: min_shape=[8, 3, 842, 595], max_shape=[8, 3, 842, 595].
[WARNING] ultra_infer/runtime/backends/tensorrt/utils.cc(52)::Update	[New Shape Out of Range] The updated shape range now: min_shape=[1, 3, 128, 64], max_shape=[8, 3, 842, 595].
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(108)::LoadTrtCache	Build TensorRT Engine from cache file: /root/.paddlex/official_models/UVDoc/trt_serialized_FP16.trt with shape range information as below,
[INFO] ultra_infer/runtime/backends/tensorrt/trt_backend.cc(111)::LoadTrtCache	Input name: image, shape=[-1, 3, -1, -1], min=[1, 3, 128, 64], max=[8, 3, 842, 595]

[INFO] ultra_infer/runtime/runtime.cc(314)::CreateTrtBackend	Runtime initialized with Backend::TRT in Device::GPU.
Using official model (PicoDet-L_layout_17cls), the model files will be automatically downloaded and saved in /root/.paddlex/official_models.
Only Paddle model is detected. Paddle model will be used by default.
Backend: paddle_infer
Backend config: cpu_num_threads=8 enable_mkldnn=True enable_trt=True trt_dynamic_shapes={'image': [[1, 3, 640, 640], [1, 3, 640, 640], [8, 3, 640, 640]], 'scale_factor': [[1, 2], [1, 2], [8, 2]]} trt_dynamic_shape_input_data={'scale_factor': [[2.0, 2.0], [1.0, 1.0], [0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67, 0.67]]} trt_precision='FP16' enable_log_info=False
[INFO] ultra_infer/vision/common/processors/transform.cc(44)::FuseNormalizeCast	Normalize and Cast are fused to Normalize in preprocessing pipeline.
[INFO] ultra_infer/vision/common/processors/transform.cc(91)::FuseNormalizeHWC2CHW	Normalize and HWC2CHW are fused to NormalizeAndPermute  in preprocessing pipeline.
[INFO] ultra_infer/vision/common/processors/transform.cc(157)::FuseNormalizeColorConvert	BGR2RGB and NormalizeAndPermute are fused to NormalizeAndPermute with swap_rb=1
[INFO] ultra_infer/runtime/backends/paddle/paddle_backend.cc(28)::BuildOption	Will inference_precision float32
[INFO] ultra_infer/runtime/backends/paddle/paddle_backend.cc(73)::BuildOption	Will try to use tensorrt fp16 inference with Paddle Backend.
[WARNING] ultra_infer/runtime/backends/paddle/paddle_backend.cc(79)::BuildOption	Detect that tensorrt cache file has been set to /root/.paddlex/official_models/PicoDet-L_layout_17cls/trt_serialized.trt, but while enable paddle2trt, please notice that the cache file will save to the directory where paddle model saved.
[WARNING] ultra_infer/runtime/backends/paddle/paddle_backend.cc(173)::BuildOption	Currently, Paddle-TensorRT does not support the new IR, and the old IR will be used.
[INFO] ultra_infer/runtime/backends/paddle/paddle_backend.cc(323)::InitFromPaddle	Start loading shape range info file /root/.paddlex/official_models/PicoDet-L_layout_17cls/shape_range_info.pbtxt to set TensorRT dynamic shape.
WARNING: Logging before InitGoogleLogging() is written to STDERR
E0307 07:16:24.821790 64326 helper.h:131] Could not register plugin creator -  ::BatchedNMSDynamic_TRT version 1
E0307 07:16:24.821828 64326 helper.h:131] Could not register plugin creator -  ::BatchedNMS_TRT version 1
E0307 07:16:24.821836 64326 helper.h:131] Could not register plugin creator -  ::BatchTilePlugin_TRT version 1
E0307 07:16:24.821854 64326 helper.h:131] Could not register plugin creator -  ::Clip_TRT version 1
E0307 07:16:24.821861 64326 helper.h:131] Could not register plugin creator -  ::CoordConvAC version 1
E0307 07:16:24.821872 64326 helper.h:131] Could not register plugin creator -  ::CropAndResizeDynamic version 1
E0307 07:16:24.821878 64326 helper.h:131] Could not register plugin creator -  ::CropAndResize version 1
E0307 07:16:24.821890 64326 helper.h:131] Could not register plugin creator -  ::DecodeBbox3DPlugin version 1
E0307 07:16:24.821902 64326 helper.h:131] Could not register plugin creator -  ::DetectionLayer_TRT version 1
E0307 07:16:24.821911 64326 helper.h:131] Could not register plugin creator -  ::EfficientNMS_Explicit_TF_TRT version 1
E0307 07:16:24.821923 64326 helper.h:131] Could not register plugin creator -  ::EfficientNMS_Implicit_TF_TRT version 1
E0307 07:16:24.821930 64326 helper.h:131] Could not register plugin creator -  ::EfficientNMS_ONNX_TRT version 1
E0307 07:16:24.821941 64326 helper.h:131] Could not register plugin creator -  ::EfficientNMS_TRT version 1
E0307 07:16:24.821959 64326 helper.h:131] Could not register plugin creator -  ::FlattenConcat_TRT version 1
E0307 07:16:24.821969 64326 helper.h:131] Could not register plugin creator -  ::fMHA_V2 version 1
E0307 07:16:24.821979 64326 helper.h:131] Could not register plugin creator -  ::fMHCA version 1
E0307 07:16:24.821986 64326 helper.h:131] Could not register plugin creator -  ::GenerateDetection_TRT version 1
E0307 07:16:24.822000 64326 helper.h:131] Could not register plugin creator -  ::GridAnchor_TRT version 1
E0307 07:16:24.822010 64326 helper.h:131] Could not register plugin creator -  ::GridAnchorRect_TRT version 1
E0307 07:16:24.822019 64326 helper.h:131] Could not register plugin creator -  ::GroupNorm version 1
E0307 07:16:24.822036 64326 helper.h:131] Could not register plugin creator -  ::InstanceNormalization_TRT version 1
E0307 07:16:24.822042 64326 helper.h:131] Could not register plugin creator -  ::InstanceNormalization_TRT version 2
E0307 07:16:24.822048 64326 helper.h:131] Could not register plugin creator -  ::LayerNorm version 1
E0307 07:16:24.822059 64326 helper.h:131] Could not register plugin creator -  ::LReLU_TRT version 1
E0307 07:16:24.822072 64326 helper.h:131] Could not register plugin creator -  ::MultilevelCropAndResize_TRT version 1
E0307 07:16:24.822079 64326 helper.h:131] Could not register plugin creator -  ::MultilevelProposeROI_TRT version 1
E0307 07:16:24.822089 64326 helper.h:131] Could not register plugin creator -  ::MultiscaleDeformableAttnPlugin_TRT version 1
E0307 07:16:24.822103 64326 helper.h:131] Could not register plugin creator -  ::NMSDynamic_TRT version 1
E0307 07:16:24.822110 64326 helper.h:131] Could not register plugin creator -  ::NMS_TRT version 1
E0307 07:16:24.822125 64326 helper.h:131] Could not register plugin creator -  ::Normalize_TRT version 1
E0307 07:16:24.822136 64326 helper.h:131] Could not register plugin creator -  ::PillarScatterPlugin version 1
E0307 07:16:24.822144 64326 helper.h:131] Could not register plugin creator -  ::PriorBox_TRT version 1
E0307 07:16:24.822155 64326 helper.h:131] Could not register plugin creator -  ::ProposalDynamic version 1
E0307 07:16:24.822162 64326 helper.h:131] Could not register plugin creator -  ::ProposalLayer_TRT version 1
E0307 07:16:24.822168 64326 helper.h:131] Could not register plugin creator -  ::Proposal version 1
E0307 07:16:24.822175 64326 helper.h:131] Could not register plugin creator -  ::PyramidROIAlign_TRT version 1
E0307 07:16:24.822185 64326 helper.h:131] Could not register plugin creator -  ::Region_TRT version 1
E0307 07:16:24.822194 64326 helper.h:131] Could not register plugin creator -  ::Reorg_TRT version 1
E0307 07:16:24.822206 64326 helper.h:131] Could not register plugin creator -  ::ResizeNearest_TRT version 1
E0307 07:16:24.822212 64326 helper.h:131] Could not register plugin creator -  ::ROIAlign_TRT version 1
E0307 07:16:24.822220 64326 helper.h:131] Could not register plugin creator -  ::RPROI_TRT version 1
E0307 07:16:24.822232 64326 helper.h:131] Could not register plugin creator -  ::ScatterND version 1
E0307 07:16:24.822240 64326 helper.h:131] Could not register plugin creator -  ::SeqLen2Spatial version 1
E0307 07:16:24.822247 64326 helper.h:131] Could not register plugin creator -  ::SpecialSlice_TRT version 1
E0307 07:16:24.822257 64326 helper.h:131] Could not register plugin creator -  ::SplitGeLU version 1
E0307 07:16:24.822269 64326 helper.h:131] Could not register plugin creator -  ::Split version 1
E0307 07:16:24.822276 64326 helper.h:131] Could not register plugin creator -  ::VoxelGeneratorPlugin version 1
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0307 07:16:26.947592 64326 place.cc:253] The `paddle::PlaceType::kCPU/kGPU` is deprecated since version 2.3, and will be removed in version 2.4! Please use `Tensor::is_cpu()/is_gpu()` method to determine the type of place.

@zhang-prog

@zhang-prog (Collaborator)

Hi, building the engine on the first run is expected.

The problem you reported was that running inference on the same image multiple times rebuilt the engine every time, which means the cache was not being kept.

That is the issue we fixed. Please run inference on the same image several times and check whether the engine is rebuilt again. With only a single inference run, we cannot tell whether the fix took effect.
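One way to verify whether the cache is being kept is to look for the serialized engine files on disk between runs. A minimal sketch (the `~/.paddlex/official_models` layout is taken from the logs above; the demo below uses a temporary directory standing in for it so the snippet is self-contained):

```shell
# Demo: the real check would target "$HOME/.paddlex/official_models";
# a temporary directory stands in for it here.
MODEL_DIR="$(mktemp -d)"
mkdir -p "$MODEL_DIR/PP-OCRv4_server_rec"
touch "$MODEL_DIR/PP-OCRv4_server_rec/trt_serialized_FP16.trt"
touch "$MODEL_DIR/PP-OCRv4_server_rec/shape_range_info.pbtxt"

# List cached TensorRT artifacts; if these disappear (or their timestamps
# change) between runs, the engine is being rebuilt from scratch each time.
find "$MODEL_DIR" -name 'trt_serialized*' -o -name 'shape_range_info.pbtxt'
```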

@Bobholamovic (Member)

Hi, I have set up the environment and run `paddlex --serve --pipeline layout_parsing --use_hpip`. The service starts, but when I test an image I still see the same behavior as before, with the same error.

(screenshot)

@zhang-prog

If you have confirmed that the following commands were run in the freshly started container:

paddlex --install serving
paddlex --install hpi-gpu
# or install the two wheel packages manually

then even if you see errors like

Error Code 3 ...

when starting the service, they can be ignored as long as the program eventually finishes building the engine (i.e. it does not exit with an error). These errors occur when the Paddle framework calls the TensorRT API during optimization and do not necessarily affect the final result.

Also, for testing I suggest starting with the default pipeline configuration file (i.e. without manually tuning the high-performance inference settings); once the basic case works, you can try further adjustments.
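For a quick sanity test against the served pipeline, a minimal client sketch may help. The endpoint name comes from the logs above (`POST /layout-parsing`); the body fields (`file` as base64, `fileType` = 1 for image) follow the PaddleX serving docs and should be verified against your deployed version:

```python
import base64

def build_payload(image_path: str) -> dict:
    """Encode an image file as the JSON body expected by POST /layout-parsing."""
    with open(image_path, "rb") as f:
        data = base64.b64encode(f.read()).decode("ascii")
    return {"file": data, "fileType": 1}  # fileType: 1 = image, 0 = PDF

# Then, e.g. with requests:
# requests.post("http://localhost:8080/layout-parsing", json=build_payload("page.png"))
```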

@Danee-wawawa (Author)

Hi, the second service startup just completed, but it took very long, roughly 15 minutes. Is that normal?
I then tested the same large image again and got the result fairly quickly, so the cache does seem to have been saved.

Still within this second service session, while repeatedly submitting the image for testing, one request suddenly got stuck, as shown below; it has been hanging for almost 6 minutes. What is going on?

(screenshot)

@Bobholamovic (Member)


A 15-minute startup is not normal; please share the logs so we can look into the cause.

As for the stuck request: judging from the logs, the engine was probably being rebuilt. Your test data most likely varies widely in size, with a range different from the data this pipeline commonly sees. In that case I suggest editing the pipeline configuration file and widening the dynamic shapes to a larger range, so that real inputs exceed the preset range as rarely as possible.
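As an illustration of "widening the range": the backend config lines in the logs (e.g. `trt_dynamic_shapes={'x': [[1, 3, 48, 160], [1, 3, 48, 320], [8, 3, 48, 640]]}`) give, per input tensor, the min, opt, and max shapes. A hypothetical YAML fragment raising the max shape might look like the following; the key names are illustrative only, so check them against your actual pipeline configuration file before use:

```yaml
# Hypothetical sketch only: widen the max shape so typical inputs stay
# inside the profile and the TensorRT engine is not rebuilt at request time.
hpi_params:
  backend_config:
    trt_dynamic_shapes:
      x:
        - [1, 3, 48, 160]    # min
        - [1, 3, 48, 320]    # opt
        - [8, 3, 48, 2304]   # max, raised to cover wider text crops
```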

@Danee-wawawa (Author)

Danee-wawawa commented Mar 7, 2025

OK 👌

By the way, inference seems to have become much slower after deploying this environment. When I previously parsed a 53-page PDF, serving deployment without the high-performance inference plugin took only 76 s; now it takes 168 s without the plugin and 147 s with it. The speedup from the plugin is not significant either.

I will verify again by testing in the same environment.

Also, does the info message below have any impact?
(screenshot)

@Bobholamovic (Member)

OK. It is a bit strange that the serving deployment without the high-performance inference plugin got slower, since as far as I know we did not change that part this time. Going from 168 s without high-performance inference to 147 s with it is plausible, for example if this pipeline's main cost lies outside model inference and the pre/post-processing tied to it. As for the warning: if the `limit_type` you use is `max`, it can be ignored; if not, the high-performance inference plugin currently does not support any `limit_type` other than `max`.

@Danee-wawawa (Author)

OK 👌

@Danee-wawawa (Author)

Hi, I redeployed on a different server, but it defaulted to running on the CPU. What could cause that? Is it related to GPU utilization?

(screenshot)

@zhang-prog

@changdazhou (Collaborator)

Is the GPU driver installed correctly?

@Danee-wawawa (Author)

Yes, it is installed correctly. I found I must add `--gpu`, i.e. `paddlex --serve --pipeline layout_parsing.yaml --use_hpip --gpu`, for it to work. Strangely, on another machine, running `paddlex --serve --pipeline layout_parsing.yaml --use_hpip` uses the GPU directly. What is going on?
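When the service silently falls back to CPU, a first check is whether the process can see the GPU driver at all. A small probe, nothing PaddleX-specific (it only checks for `nvidia-smi` and the `CUDA_VISIBLE_DEVICES` setting, and prints `False` on a machine without a driver):

```python
import os
import shutil
import subprocess

def gpu_visible() -> bool:
    """Return True if nvidia-smi exists and exits successfully."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        return subprocess.run([exe], capture_output=True, timeout=30).returncode == 0
    except (OSError, subprocess.TimeoutExpired):
        return False

print("nvidia-smi OK:", gpu_visible())
# An empty CUDA_VISIBLE_DEVICES ("") also hides all GPUs from the process.
print("CUDA_VISIBLE_DEVICES:", os.environ.get("CUDA_VISIBLE_DEVICES", "<unset>"))
```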

@Bobholamovic (Member)

Bobholamovic commented Mar 14, 2025 via email

@Danee-wawawa (Author)

OK, which parts of the environment do you mean specifically?

@Bobholamovic (Member)

Mainly the GPU model, the driver version, and the CUDA version.
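Those three pieces of information can be collected in one go. A small helper that degrades gracefully when a tool is missing (the commands are standard NVIDIA/CUDA utilities, nothing PaddleX-specific):

```python
import shutil
import subprocess

def _run(cmd: list) -> str:
    """Run a command and return its stdout, or a placeholder if unavailable."""
    if shutil.which(cmd[0]) is None:
        return "<not found>"
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
        return out.stdout.strip() or "<no output>"
    except (OSError, subprocess.TimeoutExpired):
        return "<failed>"

report = {
    # GPU model and driver version
    "nvidia-smi": _run(["nvidia-smi", "--query-gpu=name,driver_version",
                        "--format=csv,noheader"]),
    # CUDA toolkit version
    "nvcc": _run(["nvcc", "--version"]),
}
for key, value in report.items():
    print(f"{key}: {value}")
```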
