Swapping backbones (DarkNet, MobileNet, ResNet, VGG) into YOLOv5 shows the expected speed/accuracy tradeoffs for pea crop vs. weed detection: MobileNetv1 gives a slightly higher AP at IoU 0.5 than YOLOv5 but localizes worse at stricter IoUs, while YOLOv5 (CSPDarkNet53mod) stays stable across IoU thresholds at low latency, making it the fastest stable choice for embedded weeding (results and numbers taken directly from the paper).
Key numeric evidence: AP50 is 0.893 for MobileNetv1, 0.881 for YOLOv5, and 0.946 for VGG16; inference times span roughly 4.7 to 32 ms (table values include 4.7, 5.9, 10, 29, and 32 ms).
Source: the article's evaluation and result tables. The paper itself is the primary source for these statements and all numeric results (values below are taken from its tables and figures).
| Model | AP50 | AP75 | AP85 | AP90 | Speed (ms) |
|---|---|---|---|---|---|
| YOLOv5 | 0.881 | 0.874 | 0.852 | 0.798 | 10 |
| MobileNetv1 | 0.893 | 0.862 | 0.752 | 0.593 | 10.1 |
| VGG16 | 0.946 | 0.935 | 0.898 | 0.830 | 32.5 |
| YOLOv3 | 0.915 | 0.912 | 0.898 | 0.871 | 28.4 |
| YOLOv4 | 0.906 | 0.890 | 0.859 | 0.818 | 29 |
These numbers are transcribed from the paper's tables (exact table provenance is in the paper) and, together with the authors' methods, support the authors' conclusions about the tradeoffs.
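One way to put a number on "stability across IoU thresholds" is the absolute AP lost between AP50 and AP90. The sketch below uses only the five rows transcribed in the table; the metric itself (`ap_drop`) is an illustrative choice made here, not something the paper is known to report.

```python
# AP at IoU 0.50, 0.75, 0.85, 0.90, copied from the table above.
AP = {
    "YOLOv5":      [0.881, 0.874, 0.852, 0.798],
    "MobileNetv1": [0.893, 0.862, 0.752, 0.593],
    "VGG16":       [0.946, 0.935, 0.898, 0.830],
    "YOLOv3":      [0.915, 0.912, 0.898, 0.871],
    "YOLOv4":      [0.906, 0.890, 0.859, 0.818],
}


def ap_drop(values):
    """Absolute AP lost between the loosest and strictest IoU threshold.

    Hypothetical stability metric: smaller means more stable localization.
    """
    return round(values[0] - values[-1], 3)


drops = {model: ap_drop(values) for model, values in AP.items()}

# Print models from most to least stable under this metric.
for model, drop in sorted(drops.items(), key=lambda kv: kv[1]):
    print(f"{model:12s} AP50 - AP90 = {drop:.3f}")
```

On these rows MobileNetv1 loses far more AP than any other model, and between the two models near 10 ms YOLOv5 is much the steadier one. The smallest drop overall belongs to YOLOv3, which runs at 28.4 ms.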
Most important reproducibility gaps
Comparable applied papers that replace YOLO backbones for speed/weight tradeoffs, or that target small-object detection, confirm the field trend: lightweight backbones such as MobileNet variants frequently increase speed and sometimes improve coarse AP (AP50) but can lose localization quality at stricter IoU thresholds; multi-scale attention and neck improvements (as in RICE-YOLO) often yield large mAP gains for small objects in UAV imagery.
The combined AP50 vs. speed plot highlights the speed/accuracy tradeoff reported by the authors: VGG models give the highest AP50 but are slow; MobileNet, YOLO-tiny, and YOLOv5 occupy the lower-latency region with reasonable AP50; MobileNetv1's AP50 advantage over YOLOv5 is small, and per the tables its localization degrades at higher IoU.
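The same tradeoff can be read off numerically as a Pareto front: a model stays on the front if no other model is at least as fast and at least as accurate. This is a minimal sketch over the five (speed, AP50) pairs in the table; the function name `pareto_front` is introduced here for illustration, and models missing from the table (for example YOLO-tiny or ResNet) are not included.

```python
# (inference time in ms, AP50) pairs copied from the table above.
MODELS = {
    "YOLOv5":      (10.0, 0.881),
    "MobileNetv1": (10.1, 0.893),
    "VGG16":       (32.5, 0.946),
    "YOLOv3":      (28.4, 0.915),
    "YOLOv4":      (29.0, 0.906),
}


def pareto_front(models):
    """Return the models not dominated on (lower latency, higher AP50)."""
    front = []
    for name, (ms, ap) in models.items():
        dominated = any(
            o_ms <= ms and o_ap >= ap and (o_ms < ms or o_ap > ap)
            for other, (o_ms, o_ap) in models.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    # Order the front from fastest to slowest for readability.
    return sorted(front, key=lambda n: models[n][0])


front = pareto_front(MODELS)
print(front)
```

With these five rows only YOLOv4 drops off the front, because YOLOv3 is both slightly faster and slightly more accurate at IoU 0.5. Note that this uses AP50 alone, so it says nothing about the localization loss at stricter thresholds.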
Conclusion: The paper credibly demonstrates the expected engineering tradeoffs: heavyweight models (VGG, ResNet, YOLOv3/4) give higher AP and more stable localization but at a large compute cost; MobileNetv1 edges out YOLOv5 on AP50 on this pea dataset but loses localization precision at stricter IoU thresholds; YOLOv5 (CSPDarkNet53mod) provides the best balance of speed and stability for embedded weed-removal systems on the authors' dataset and under their conditions.
Confidence in that conclusion: moderate (approx 6/10) because the numeric evidence is consistent and plausible but the very small independent test set (n=54) and lack of public dataset/code reduce confidence in generalizability to other fields, crops, and conditions.
What would disprove this conclusion: rigorous external evaluation on an expanded, independently collected multi-environment dataset showing MobileNetv1 or YOLOv5 failing to replicate reported AP50/AP75 patterns or showing different backbone/neck combos (e.g. lightweight backbone plus attention neck) outperforming the reported best combos on real embedded hardware; or demonstration that AP changes are within noise due to the tiny test set (confidence intervals overlapping).
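To get a feel for how wide the uncertainty could be at n=54, the following back-of-envelope sketch computes 95% Wilson score intervals for the YOLOv5 and MobileNetv1 AP50 values. It rests on a loud assumption: AP is treated as if it were a simple proportion over 54 independent samples, which it is not. A proper check would bootstrap over the per-image detections, which are not available here.

```python
import math


def wilson_interval(p, n, z=1.96):
    """Wilson score interval for a proportion p observed over n samples.

    ASSUMPTION: AP is treated as a plain proportion, which is only a
    crude stand-in for a bootstrap over the real per-image detections.
    """
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


N_TEST = 54  # test-set size quoted in the summary

yolo = wilson_interval(0.881, N_TEST)    # YOLOv5 AP50
mobile = wilson_interval(0.893, N_TEST)  # MobileNetv1 AP50

# Two intervals overlap when each one starts before the other ends.
overlap = yolo[0] <= mobile[1] and mobile[0] <= yolo[1]

print(f"YOLOv5      AP50 interval: {yolo[0]:.3f} to {yolo[1]:.3f}")
print(f"MobileNetv1 AP50 interval: {mobile[0]:.3f} to {mobile[1]:.3f}")
print("intervals overlap:", overlap)
```

Under this approximation the two intervals overlap almost entirely, which is consistent with the concern that the 0.012 AP50 gap between the two models may be within noise.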