Plant Phenomics

Research Article | Open Access

Volume 2024 | Article ID 0220 | https://doi.org/10.34133/plantphenomics.0220

A Multi-Modal Open Object Detection Model for Tomato Leaf Diseases with Strong Generalization Performance Using PDC-VLD

Jinyang Li,¹ Fengting Zhao,¹ Hongmin Zhao,¹ Guoxiong Zhou,¹ Jiaxin Xu,¹ Mingzhou Gao,² Xin Li,³ Weisi Dai,¹ Honliang Zhou,¹ Yahui Hu,⁴ and Mingfang He¹

¹Central South University of Forestry and Technology, Changsha 410004, Hunan, China
²Inner Mongolia Agriculture University, Hohhot 010010, Inner Mongolia Autonomous Region, China
³Inner Mongolia University, Hohhot 010021, Inner Mongolia Autonomous Region, China
⁴Plant Protection Institute, Hunan Academy of Agricultural Sciences, Changsha 410125, Hunan, China

Received 15 Apr 2024 | Accepted 01 Jul 2024 | Published 13 Aug 2024

Abstract

Precise disease detection is crucial in modern precision agriculture, particularly for keeping tomato crops healthy and for improving agricultural productivity and product quality. Existing disease detection methods have helped growers identify tomato leaf diseases to some extent, but they typically target a fixed set of categories. When a new disease appears, extensive and costly manual annotation is required before the model can be retrained. To overcome these limitations, this study proposes PDC-VLD, a multimodal model based on open-vocabulary object detection (OVD) within the VLDet framework, which can accurately identify new tomato leaf diseases without manual annotation by using only image–text pairs. First, we developed a progressive visual transformer-convolutional pyramid module (PVT-C) that extracts tomato leaf disease features and optimizes anchor box positioning using the self-supervised learning algorithm DINO, suppressing interference from irrelevant backgrounds. Then, a context feature guided module (CFG) was adopted to address the model’s low adaptability and recognition accuracy in data-scarce environments. To validate the model’s effectiveness, we constructed a tomato leaf disease image dataset containing 4 base classes and 2 novel categories. Experimental results show that PDC-VLD achieved 61.2% on the main evaluation metric $\mathrm{mAP}_{novel}^{50}$, 56.4% on $\mathrm{mAP}_{novel}^{75}$, 87.7% on $\mathrm{mAP}_{base}^{50}$, 81.0% on $\mathrm{mAP}_{all}^{50}$, and 45.5% on average recall, outperforming existing OVD models. Our research provides a solution for efficiently and accurately detecting new diseases that substantially reduces the need for manual annotation, and it offers technical support and a practical reference for agricultural workers.
