Journal of Chongqing University of Technology (Natural Science) ›› 2023, Vol. 37 ›› Issue (5): 210-217.

• Information and Computer Science •

一种改进神经网络的苹果快速识别算法

Cao Zhipeng (曹志鹏), Yuan Ruibo (袁锐波), Yang Xiao (杨肖), Lin Honggang (林红刚), Zhu Zheng (朱正)

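  1. (Faculty of Mechanical and Electrical Engineering, Kunming University of Science and Technology, Kunming 650504, China)
  • Publication date: 2023-06-21 Release date: 2023-06-21
  • About the authors: Cao Zhipeng, male, master's student, mainly engaged in research on machine vision and artificial intelligence, Email: 1476580651@qq.com; corresponding author Yuan Ruibo, male, PhD, professor, mainly engaged in research on fluid power transmission and control, Email: 22911979@qq.com.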

A fast apple recognition algorithm based on improved neural networks

  • Online: 2023-06-21 Published: 2023-06-21

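Abstract (translated from the Chinese): Picking robots have limited computing power, which restricts object detection speed and makes real-time operation difficult. To address this, a lightweight algorithm based on an improved YOLOv4 is proposed to increase detection speed and reduce network size. The lightweight Ghostnet backbone replaces the CSPdarknet53 backbone of YOLOv4, which reduces the number of parameters. On top of the backbone replacement, depthwise separable convolution replaces the convolutions in the YOLOv4 neck network, which further reduces the weights and the amount of computation. The number of CBL convolution module layers before and after the spatial pyramid pooling is then increased from 3 to 5, which improves feature extraction and the network's acquisition of image information, and therefore accuracy. The KNN clustering algorithm is used to compute the prior boxes for prediction, and Mosaic data augmentation is used to improve recognition accuracy. The apple detection results show that the modified network recognizes apples with good accuracy; compared with YOLOv4, detection speed increases by 45.8%, reaching 35 FPS, and the overall network weights are reduced by 79.7%. The modified network improves detection speed, reduces the weight file size, and is better suited to picking robots with limited computing power and storage space.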

Keywords: YOLOv4, KNN clustering, Ghostnet, spatial pyramid pooling

Abstract:

The edge devices of picking robots have insufficient computing power, which limits object detection speed and makes real-time operation difficult. To address these problems, this paper proposes a lightweight algorithm based on an improved YOLOv4 that increases detection speed, reduces network size, and is better suited to edge devices. The lightweight backbone network Ghostnet, proposed by Huawei, replaces the CSPdarknet53 backbone in YOLOv4; compared with the original YOLOv4, the resulting network has fewer parameters and is lighter. Ordinary convolution extracts many similar feature maps, and Ghostnet merges such similar features into the network after the convolution instead of computing each of them separately, which reduces the amount of computation without reducing accuracy.
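As a rough illustration of why producing part of the feature maps cheaply saves weights, the sketch below counts parameters for an ordinary convolution and for a Ghost-style module. The formulas follow the usual description of the Ghost module (a primary convolution that produces `c_out / ratio` intrinsic maps, plus one small depthwise filter for each remaining map). The function names, the `ratio=2` and `cheap_k=3` defaults, and the layer sizes are assumptions chosen for illustration, not values taken from this paper.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights of an ordinary k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in: int, c_out: int, k: int,
                 ratio: int = 2, cheap_k: int = 3) -> int:
    """Weights of a Ghost-style module (bias ignored).

    Assumption: c_out / ratio intrinsic maps come from an ordinary
    k x k convolution; every other output map is derived from an
    intrinsic map by one cheap_k x cheap_k depthwise filter.
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio")
    intrinsic = c_out // ratio
    primary = conv_params(c_in, intrinsic, k)
    cheap = (ratio - 1) * intrinsic * cheap_k * cheap_k
    return primary + cheap


if __name__ == "__main__":
    # Hypothetical layer: 128 input channels, 256 output channels, 3 x 3 kernel.
    ordinary = conv_params(128, 256, 3)
    ghost = ghost_params(128, 256, 3)
    print(ordinary, ghost, round(ordinary / ghost, 2))
```

For this hypothetical layer the ordinary convolution needs 294,912 weights and the Ghost-style module 148,608, so the saving approaches the chosen `ratio`.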

After the backbone network is replaced, depthwise separable convolution replaces the convolutional blocks of the neck network in YOLOv4. Depthwise separable convolution splits an ordinary convolution into two simpler steps, which further reduces the weights and the amount of computation. Although these modifications improve detection speed, they lower accuracy. To recover accuracy without greatly increasing computation and weights, the number of CBL convolution module layers before and after the spatial pyramid pooling is increased from three to five, which strengthens information extraction.
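The two steps are conventionally a per-channel (depthwise) k × k convolution followed by a 1 × 1 (pointwise) convolution that mixes channels. The sketch below compares the cost of both forms under stated assumptions: stride 1, padding that keeps the output at `h × w`, and no bias. The function names and the layer sizes are hypothetical and are not taken from the paper.

```python
def standard_conv_cost(c_in: int, c_out: int, k: int, h: int, w: int):
    """Return (weights, multiply-accumulates) of an ordinary k x k convolution."""
    params = k * k * c_in * c_out
    return params, params * h * w


def depthwise_separable_cost(c_in: int, c_out: int, k: int, h: int, w: int):
    """Return (weights, multiply-accumulates) of a depthwise separable convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    params = depthwise + pointwise
    return params, params * h * w


if __name__ == "__main__":
    # Hypothetical neck layer: 256 -> 512 channels, 3 x 3 kernel, 13 x 13 feature map.
    std_params, std_macs = standard_conv_cost(256, 512, 3, 13, 13)
    ds_params, ds_macs = depthwise_separable_cost(256, 512, 3, 13, 13)
    print(std_params, ds_params, ds_params / std_params)
```

Under these assumptions the ratio of the two costs equals 1/c_out + 1/k², so a 3 × 3 layer with many output channels keeps a little over one ninth of its original weights and computation.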

Better information extraction from the feature map at the end of the backbone network also improves accuracy, so this stage is where image feature extraction and the network's overall acquisition of image information are strengthened. To improve accuracy further, the KNN clustering algorithm is used to compute the prior boxes for prediction, which prepares the network better for subsequent training: the closer the prior boxes are to the target boxes, the more accurate the trained network. Mosaic data augmentation is also used to improve recognition accuracy.

The apple detection results show that the modified network achieves good recognition accuracy. Compared with YOLOv4, detection speed increases by 45.8%, reaching 35 FPS, and the weight of the whole network is reduced by 79.7%. Compared with the other algorithms discussed in this paper, the modified network is better in both accuracy and prediction speed. In the comparison of detection images, its bounding boxes are more accurate and it misses fewer targets than the other networks. Overall, the modified network outperforms the original YOLOv4 and the other networks. It improves detection speed, reduces the weight file size, and is better suited to edge devices such as picking robots, which have limited computing power and storage space.
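The prior-box step can be illustrated with a small clustering sketch. The abstract names KNN clustering without giving details, so the code below instead uses the k-means procedure with IoU as the similarity measure that is commonly used to derive YOLO anchors; this substitution, the function names, and the toy box sizes are all assumptions rather than the authors' implementation.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) boxes aligned at the same corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k prior boxes, sorted by area."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # start from k distinct labelled boxes
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Assign each box to the anchor it overlaps most.
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)
        new_anchors = []
        for i, members in enumerate(clusters):
            if not members:  # keep the old anchor if its cluster is empty
                new_anchors.append(anchors[i])
                continue
            mean_w = sum(b[0] for b in members) / len(members)
            mean_h = sum(b[1] for b in members) / len(members)
            new_anchors.append((mean_w, mean_h))
        if new_anchors == anchors:  # assignments no longer change
            break
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])


if __name__ == "__main__":
    # Toy label sizes in pixels: three small and three large apples.
    boxes = [(10, 10), (12, 11), (11, 12), (50, 52), (48, 50), (52, 49)]
    print(kmeans_anchors(boxes, k=2))
```

On the toy data the two resulting prior boxes settle near 11 × 11 and 50 × 50 pixels, one per size group, which is the sense in which prior boxes come to resemble the target boxes before training.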

CLC Number:

  • TP316.6