1College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
2Agricultural Information Institute of CAAS, Beijing 100081, China
3National Climate Center, Beijing 100081, China
Received 24 May 2024
Accepted 12 Nov 2024
Published 16 Dec 2024
In contemporary agriculture, experts develop preventative and remedial strategies for the various disease stages of diverse crops. Deciding the stage of disease occurrence exceeds the capabilities of single-image tasks such as image classification and object detection. Consequently, research now focuses on training visual question answering (VQA) models. However, existing studies concentrate on identifying disease species rather than formulating questions that cover multiple crucial attributes. Additionally, model performance is sensitive to model structure and dataset biases. To address these challenges, we construct the informed-learning-guided VQA model of crop disease (ILCD). ILCD improves model performance by integrating coattention, a multimodal fusion model (MUTAN), and a bias-balancing (BiBa) strategy. To facilitate the investigation of various visual attributes of crop diseases and the determination of disease occurrence stages, we construct a new VQA dataset called the Crop Disease Multi-attribute VQA with Prior Knowledge (CDwPK-VQA). This dataset contains comprehensive information on visual attributes such as shape, size, status, and color. We expand the dataset by integrating prior knowledge into CDwPK-VQA to address performance challenges. In comparative experiments on the VQA-v2, VQA-CP v2, and CDwPK-VQA datasets, ILCD achieves accuracies of 68.90%, 49.75%, and 86.06%, respectively. Ablation experiments on CDwPK-VQA evaluate the effectiveness of the individual modules, including coattention, MUTAN, and BiBa. These experiments show that ILCD achieves the highest accuracy among the compared models, indicating its practical value for agriculture. The source code is available at https://github.com/SdustZYP/ILCD-master/tree/main.