Journal of Henan Agricultural Sciences, 2025, Vol. 54, Issue (8): 167-180. DOI: 10.15933/j.cnki.1004-3268.2025.08.017

• Agricultural Information and Engineering and Agricultural Product Processing •

Research on County‑Level Yield Simulation of Winter Wheat in Henan Province Based on Machine Learning Algorithms

LIU Xinglin, LIU Yuan, YANG Fan, LIU Buchun, HAN Rui

  1. (Institute of Environment and Sustainable Development in Agriculture, Chinese Academy of Agricultural Sciences/State Engineering Laboratory of Efficient Water Use of Crops and Disaster Mitigation, Beijing 100081, China)
  • Received: 2024-11-25 Accepted: 2025-01-15 Published: 2025-08-15 Online: 2025-08-18

基于机器学习算法的河南省县级冬小麦产量模拟研究

刘星麟,刘园,杨凡,刘布春,韩锐   

  1. (中国农业科学院 农业环境与可持续发展研究所/作物高效用水与抗灾减损国家工程实验室,北京 100081)
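  • Corresponding authors: LIU Yuan (b. 1983), female, from Tianjin, associate researcher, Ph.D., mainly engaged in research on agrometeorological disaster risk assessment. E-mail: liuyuan@caas.cn; HAN Rui (b. 1990), male, from Dezhou, Shandong, assistant researcher, Ph.D., mainly engaged in research on agrometeorological disaster risk assessment. E-mail: hanrui@caas.cn
  • About the author: LIU Xinglin (b. 2000), male, from Linyi, Shandong, master's student; research interest: agrometeorological disasters. E-mail: 853409299@qq.com
  • Funding:
    Major Task Project of the Chinese Academy of Agricultural Sciences, "Strategic Research on the Sustained Improvement of Grain Production Capacity" (CAAS-ZDRW202419)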

Abstract: Henan is a major winter wheat producing province, and simulating winter wheat yield is of great significance for ensuring national food security. This study evaluated the performance of machine learning models in winter wheat yield simulation using ten-day scale meteorological data and county-level winter wheat yield data from 16 counties (cities) in Henan Province from 2000 to 2019. The dataset was divided into a test set (2000–2015) and a training set (2016–2019). County-level yield simulation models for winter wheat in Henan Province were constructed using multiple stepwise regression, random forest, and random forest out-of-bag (OOB) methods, and the simulation performance of the models was validated and compared. The results showed that, from 2000 to 2019, county-level winter wheat yield in Henan Province fluctuated between 2,001 and 7,980 kg/ha, with an average of 5,675 kg/ha and a coefficient of variation ranging from 3.75% to 26.58%. A multiple stepwise regression model was constructed from 19 ten-day scale meteorological factors that passed the 95% significance test; in validation it achieved a coefficient of determination (R2) of 0.6209 and a root mean square error (RMSE) of 907.06 kg/ha. The random forest model constructed using all feature factors achieved a validation R2 of 0.7725 and an RMSE of 664.36 kg/ha. A total of 68 key ten-day scale meteorological feature factors were screened by random forest OOB importance analysis; among them, the ten-day scale meteorological factors in November of the preceding year and in March, April, and June had particularly significant impacts on winter wheat yield. The random forest OOB model for simulating county-level winter wheat yield achieved a validation R2 of 0.8605 and an RMSE of 636.58 kg/ha. It outperformed the multiple stepwise regression model and the random forest model, with R2 higher by 38.59% and 11.39%, respectively, and RMSE lower by 29.82% and 4.18%, respectively. Using limited meteorological data and county-level yield data, this study achieved reliable and reasonably accurate winter wheat yield simulation, providing a methodological reference for regional winter wheat yield simulation.
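The relative improvements quoted in the abstract can be reproduced from the reported validation metrics. The sketch below is only an arithmetic check, not the authors' code; the function name `relative_change` and the dictionary layout are illustrative assumptions, and the numbers are the R2 and RMSE values stated in the abstract.

```python
# Arithmetic check of the relative changes reported in the abstract.
# `relative_change` and the `reported` layout are hypothetical helpers,
# not part of the study; only the metric values come from the abstract.

def relative_change(new, baseline):
    """Signed percentage change of `new` relative to `baseline`."""
    return (new - baseline) / baseline * 100.0

# Validation metrics as reported (RMSE in kg/ha).
reported = {
    "multiple stepwise regression": {"r2": 0.6209, "rmse": 907.06},
    "random forest": {"r2": 0.7725, "rmse": 664.36},
    "random forest OOB": {"r2": 0.8605, "rmse": 636.58},
}

best = reported["random forest OOB"]
summary = {}
for name, metrics in reported.items():
    if name == "random forest OOB":
        continue
    summary[name] = {
        # Positive value: R2 of the OOB model is higher than the baseline.
        "r2_change_pct": round(relative_change(best["r2"], metrics["r2"]), 2),
        # Negative value: RMSE of the OOB model is lower than the baseline.
        "rmse_change_pct": round(relative_change(best["rmse"], metrics["rmse"]), 2),
    }

for name, changes in summary.items():
    print(name, changes)
```

Under these inputs, the R2 changes round to 38.59% and 11.39% and the RMSE changes to -29.82% and -4.18%, which agrees with the figures in the abstract.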

Key words: Winter wheat, Yield prediction, Random forest, Meteorological factors, OOB analysis, Ten-day scale

