Journal of Henan Agricultural Sciences (河南农业科学) ›› 2024, Vol. 53 ›› Issue (3): 151-157. DOI: 10.15933/j.cnki.1004-3268.2024.03.016

• Agricultural Information and Engineering · Agricultural Product Processing •

Lightweight YOLO v5s Blueberry Detection Algorithm Based on Attention Mechanism

LIU Yongmin1,2, ZHANG Wei1,2, MA Haizhi1,2, LIU Yuan1,2, ZHANG Yi1

  1. School of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha 410004, China
  2. Smart Forestry Cloud Research Center, Central South University of Forestry and Technology, Changsha 410004, China
  • Received: 2023-12-06 Published: 2024-03-15 Online: 2024-04-19
  • Corresponding author: ZHANG Wei (1999-), male, born in Changsha, Hunan; master's student; research interests: artificial intelligence, image processing, and information security. E-mail: 20221200528@csuft.edu.cn
  • About the author: LIU Yongmin (1971-), male, born in Zhuzhou, Hunan; professor, Ph.D.; research interests: artificial intelligence, Internet of Things, cloud computing, 5G/6G communications, and network performance evaluation. E-mail: T20040550@csuft.edu.cn
  • Funding: National Natural Science Foundation of China (31870532); Natural Science Foundation of Hunan Province (2021JJ31163); Hunan Provincial Education Science "13th Five-Year Plan" Project (XJK20BGD048)

Abstract: To achieve accurate and rapid detection of blueberries in natural environments, an improved algorithm that combines a lightweight network with attention mechanisms is proposed on the basis of YOLO v5s. First, the large-object detection layer is removed from both the backbone network and the detection head, which reduces the number of model parameters and strengthens the model's ability to detect small targets. Second, the C3 module preceding SPPF (Spatial pyramid pooling-fast) is replaced with MHSA (Multi-head self-attention), so that the model learns more comprehensive feature representations and better understands the complex spatial relationships and contextual information in blueberry images. Finally, S-PSA (Sequential polarized self-attention) is added to the C3 module so that the model better captures contextual dependencies between adjacent regions of the feature map. The results show that the improved YOLO v5s algorithm raises detection accuracy for mature, semi-mature, and immature blueberries by 1.2, 4.4, and 2.6 percentage points, respectively, raises average accuracy by 2.7 percentage points, and reduces the number of model parameters by 76.0%. Compared with current mainstream lightweight object detection models, the improved model performs better and can provide an effective solution for the vision system of blueberry-picking robots in natural environments.

Key words: Blueberry detection, YOLO v5s, Lightweight network, Attention mechanism, Multi-head self-attention (MHSA)
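The abstract attributes the gain in contextual understanding to multi-head self-attention, in which every spatial position of a feature map attends to every other position. The sketch below illustrates only that general mechanism in plain Python and is not the authors' module: it assumes identity query/key/value projections in place of learned weights, omits any positional encoding, and the function names (`softmax`, `self_attention`, `multi_head_self_attention`) are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention over a list of equal-length vectors.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections); a trained module would use learned weights.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, key)) / math.sqrt(d) for key in tokens]
        weights = softmax(scores)
        # Output is a weighted average of all positions' values.
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out


def multi_head_self_attention(feature_map, num_heads):
    """Apply self-attention per head to an H x W x C feature map (nested lists).

    Channels are split evenly across heads, each head attends over all
    H*W positions, and the head outputs are concatenated back to C channels.
    """
    h, w = len(feature_map), len(feature_map[0])
    c = len(feature_map[0][0])
    if c % num_heads != 0:
        raise ValueError("channel count must be divisible by num_heads")
    head_dim = c // num_heads

    # Flatten the spatial grid into a sequence of H*W tokens.
    tokens = [feature_map[y][x] for y in range(h) for x in range(w)]

    heads = []
    for k in range(num_heads):
        start = k * head_dim
        head_tokens = [t[start:start + head_dim] for t in tokens]
        heads.append(self_attention(head_tokens))

    # Concatenate head outputs per position, then restore the H x W layout.
    merged = [sum((heads[k][i] for k in range(num_heads)), []) for i in range(h * w)]
    return [merged[y * w:(y + 1) * w] for y in range(h)]


# Toy 2 x 2 feature map with 4 channels, split into 2 heads.
feature_map = [
    [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]],
    [[1.0, 1.0, 0.0, 0.0], [2.0, 0.0, 1.0, 1.0]],
]
attended = multi_head_self_attention(feature_map, num_heads=2)
```

Because each output vector is a weighted average over all positions, the output keeps the input's shape while mixing information across the whole map, which is the global-context property the abstract relies on. The cost grows with the square of the number of positions, so attention of this kind is cheapest on the low-resolution feature maps found late in a backbone.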
