Research on Succulent Plant Classification and Recognition Method Based on Contrastive Learning

doi:10.15933/j.cnki.1004-3268.2023.07.016

Abstract

Succulent plants comprise many species with large intra-class differences and small inter-class differences, and image data are difficult to collect, so traditional classification algorithms cannot classify succulent plant images effectively. To address this, a contrastive-learning-based succulent plant image classification network, CL_ConvNeXt, is proposed. The network uses ConvNeXt as its backbone and introduces the idea of contrastive learning: a non-linear projection layer (projection head) is added at an intermediate layer of the network as an auxiliary classifier to help the model extract features from the shallow layers; within each batch, positive samples are constructed through data augmentation and the remaining samples are treated as negative samples; and the cross-entropy loss and the contrastive loss are combined by weighting into a redesigned loss function, which enables single-stage model training. During training, transfer learning is used to load pre-trained weights into the model to speed up convergence, and the training strategies and parameters are tuned to further improve recognition accuracy. On a self-built dataset of 190 succulent plant classes, under the same training strategy and environment configuration, the final CL_ConvNeXt model reached a recognition accuracy of 91.79%, which is 12.24 percentage points higher than that of the original ConvNeXt model, indicating that the method is effective for succulent plant image classification and recognition.

Key words: succulents, image classification, contrastive learning, ConvNeXt, projection head
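
The abstract describes the single-stage loss only in words, so the following is a minimal sketch of one way such a weighted loss could look, not the authors' implementation. It assumes a SimCLR-style (NT-Xent) contrastive term computed on projection-head outputs and an additive weighting of the form L = L_CE + lam * L_con. The function names, the weight `lam`, and the `temperature` value are all hypothetical and do not come from the paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if its norm is zero)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def log_softmax_at(scores, target):
    """Return log(softmax(scores)[target]) using a numerically stable form."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return scores[target] - log_sum


def cross_entropy(logits, labels):
    """Mean cross-entropy over a batch of classifier logits."""
    total = sum(log_softmax_at(row, y) for row, y in zip(logits, labels))
    return -total / len(labels)


def nt_xent(view_a, view_b, temperature=0.5):
    """SimCLR-style contrastive loss (an assumed choice of contrastive term).

    view_a[i] and view_b[i] are projection-head outputs for two augmented
    views of image i, so they form a positive pair. Every other embedding
    in the batch is treated as a negative.
    """
    z = [l2_normalize(v) for v in view_a + view_b]
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same image
        others = [j for j in range(2 * n) if j != i]  # exclude self-similarity
        sims = [
            sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in others
        ]
        total -= log_softmax_at(sims, others.index(positive))
    return total / (2 * n)


def combined_loss(logits, labels, view_a, view_b, lam=0.5, temperature=0.5):
    """Weighted sum of both terms; the form and lam are assumptions."""
    ce = cross_entropy(logits, labels)
    con = nt_xent(view_a, view_b, temperature)
    return ce + lam * con
```

Because both terms are computed from the same batch, a single backward pass can update the backbone, the classifier and the projection head together, which is consistent with the single-stage training the abstract mentions. Setting `lam=0.0` reduces the sketch to plain cross-entropy training.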

CLC number: S126

Cite this article: FENG Yuxin, LIANG Shaohua, TONG Hao. Research on Succulent Plant Classification and Recognition Method Based on Contrastive Learning[J]. Journal of Henan Agricultural Sciences, 2023, 52(7): 154-162.

