Journal of Chongqing University of Technology (Natural Science) ›› 2023, Vol. 37 ›› Issue (3): 312-320.

• Special column: "The 23rd International Conference on Fluid Power and Mechatronic Control Engineering"

Improved YOLOv4 Algorithm for Broad Bean Sprout Detection and TensorRT Acceleration

Yang Xiao, Yuan Ruibo, Li Zhaoxu

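  1. Faculty of Mechanical and Electrical Engineering, Kunming University of Science and Technology, Kunming 650504, China
  • Publication date: 2023-04-26 Release date: 2023-04-26
  • About the authors: Yang Xiao, master's student, mainly engaged in research on machine vision and artificial intelligence, Email: 1037130235@qq.com; corresponding author Yuan Ruibo, male, Ph.D., professor, mainly engaged in research on fluid power transmission and control, Email: 22911979@qq.com.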

An improved YOLOv4 algorithm for broad bean sprout detection and TensorRT acceleration

  • Online: 2023-04-26 Published: 2023-04-26

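Abstract (translated from the Chinese): A lightweight broad bean sprout detection method based on an improved YOLOv4 network is proposed. MobileNet replaces CSPDarknet53, the original YOLOv4 backbone, and depthwise separable convolutions replace the ordinary convolutions in the backbone, the enhanced feature extraction network, and the prediction layers. After the improved network is trained, NVIDIA's acceleration engine TensorRT is used to restructure and optimize the network, which raises GPU efficiency and enables accelerated model inference on an embedded platform. The experimental results show that the improved network shrinks to about 20% of the original network's size and that AP drops by only 3.14%, while detection is 4 times as fast as with the original network. On the Jetson Nano embedded platform, the improved model reaches an inference speed of 20.3 FPS, indicating that the proposed network can support real-time application of deep learning models on embedded platforms.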

Keywords: object detection, YOLOv4, TensorRT, embedded platform

Abstract: This paper proposes a lightweight broad bean sprout detection method based on an improved YOLOv4 network. The original YOLOv4 backbone, CSPDarknet53, is replaced by MobileNet, and the ordinary convolutions in the backbone, the enhanced feature extraction network, and the prediction layers are replaced by depthwise separable convolutions. After the improved network is trained, NVIDIA's acceleration engine TensorRT is used to restructure and optimize the network, which improves GPU efficiency and enables accelerated inference on an embedded platform. The experimental results show that the improved network shrinks to approximately 20% of the original network's size and that AP drops by only 3.14%, while detection is 4 times as fast as with the original network. On the Jetson Nano embedded platform, the improved model reaches an inference speed of 20.3 FPS, which indicates that the proposed network can support real-time application of deep learning models on embedded platforms.
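
The abstract attributes the smaller model to depthwise separable convolutions. As a rough illustration of why that substitution reduces size, the sketch below compares weight counts for a single layer. The function names and the layer dimensions (256 input channels, 512 output channels, 3 x 3 kernel) are assumptions chosen for illustration and are not taken from the paper. A single-layer ratio also should not be read as the network-level figure of about 20% reported above.

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a k x k standard convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a k x k depthwise convolution followed by a
    1 x 1 pointwise convolution (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, for illustration only.
    c_in, c_out, k = 256, 512, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
```

For a layer of this shape the ratio works out to 1/c_out + 1/k^2, so the saving grows with the kernel size and the number of output channels.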

CLC Number:

  • F326.4