Journal of Henan Agricultural Sciences ›› 2024, Vol. 53 ›› Issue (1): 152-161. DOI: 10.15933/j.cnki.1004-3268.2024.01.017

• Agricultural Information and Engineering · Agricultural Product Processing •

Research on Walnut Recognition Algorithm in Natural Environment Based on Improved YOLOX

ZHONG Zhengyang1,2, YUN Lijun1,2, YANG Xuanxi3, CHEN Zaiqing1,2

  1. School of Information, Yunnan Normal University, Kunming 650500, China; 2. Yunnan Provincial Department of Education Computer Vision and Intelligent Control Technology Engineering Research Center, Kunming 650500, China; 3. Satellite Forestry Application Center, Ecological Branch of Yunnan Forestry Survey and Planning Institute, Kunming 650500, China
  • Received: 2023-08-10 Published: 2024-01-15 Online: 2024-02-27
  • Corresponding author: YUN Lijun (born 1973), male, from Hohhot, Inner Mongolia; professor, Ph.D.; research interests: video and image processing, Internet of Things technology. E-mail: yunlijun@ynnu.edu.cn
  • About the first author: ZHONG Zhengyang (born 1999), male, from Ma'anshan, Anhui; master's student; research interest: video and image processing. E-mail: zhongzheng11111@163.com
  • Funding:
    Scientific Research Fund Project of Yunnan Provincial Department of Education (2023Y0533)

Abstract: To address the missed and false detections that existing object detection algorithms produce when recognizing walnuts in natural environments, we proposed an improved YOLOX-S walnut recognition algorithm based on Swin Transformer multi-layer feature fusion. First, a multi-layer feature fusion module based on Swin Transformer was introduced into the backbone feature extraction network. The multi-head attention mechanism of Swin Transformer was used to extract the feature information of small targets and fuse it with the feature maps, which effectively addressed the loss of small-target feature information in high-level feature maps caused by deepening the network. Second, to improve detection accuracy, we replaced the CSP module in the original network with the more efficient Repblock module. Finally, to improve the down-sampling effect, we used the Transition Block module as the down-sampling module of the backbone feature extraction network. The results showed that the improved YOLOX-S model reached an average precision (AP50) of 96.72% on the collected dataset of walnuts in natural environments, which was 7.36, 1.38, and 0.62 percentage points higher than the Faster R-CNN, YOLOv5-S, and YOLOX-S algorithms, respectively. The detection speed reached 46 f/s, and the model parameter size was 20.55 M. The improved YOLOX-S algorithm achieved higher precision, reduced missed and false detections, and recognized walnuts in natural environments more reliably.

Key words: Walnut detection, Swin Transformer, Multi-layer feature fusion module, YOLOX-S, Deep learning
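The abstract reports the improved model's AP50 and its gains in percentage points, but not the baseline scores themselves. As a rough illustration only, the baselines can be backed out by subtraction, and the frame rate can be converted to an average per-frame time. The function names below are hypothetical, and the derived baseline values are inferred from the abstract's figures rather than quoted from the paper; they assume each gain was measured on the same AP50 metric and dataset.

```python
# Figures quoted in the abstract (AP50 in percent, gains in percentage points).
IMPROVED_AP50 = 96.72
GAINS_PP = {"Faster R-CNN": 7.36, "YOLOv5-S": 1.38, "YOLOX-S": 0.62}
FPS = 46


def implied_baseline_ap50(improved, gains):
    """Back out each baseline AP50 as the improved score minus its gain.

    These are inferred values (an assumption), not numbers stated in the abstract.
    """
    return {name: round(improved - gain, 2) for name, gain in gains.items()}


def per_frame_latency_ms(fps):
    """Average time per frame in milliseconds for a given frame rate."""
    return round(1000.0 / fps, 1)


baselines = implied_baseline_ap50(IMPROVED_AP50, GAINS_PP)
latency_ms = per_frame_latency_ms(FPS)

for name, ap50 in baselines.items():
    print(f"{name}: implied AP50 = {ap50:.2f}%")
print(f"Per-frame latency at {FPS} f/s = {latency_ms} ms")
```

Under these assumptions, the 0.62 percentage-point gain over YOLOX-S is measured against a baseline that is already above 96%, so the improvement is best read alongside the reported reduction in missed and false detections.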

CLC number: