Human versus machine: Can generative AI anticipate insect biological control outcomes?

Generative artificial intelligence (AI) could transform evidence synthesis and revolutionize the global scientific enterprise, yet its agricultural applications are understudied. Here, we systematically assess the performance of three web-grounded AI engines (ChatGPT, ScholarAI and DeepSeek) in synthesizing the global literature on biological control of the fall armyworm, Spodoptera frugiperda, and benchmark their outputs against a recent, near-exhaustive human review. Though all engines rapidly screened vast literature corpora, they exhibited shortcomings in factual accuracy, reporting reliability and data consistency. In machine-run syntheses, natural enemy prevalence and performance data often diverged from published records, while the level of agreement in enumerating top-performing taxa was uniformly low. Meanwhile, internal consistency between laboratory and field-level parasitism data for ScholarAI and DeepSeek was similar to that in human-run reviews. All models were prone to faulty data extrapolation, hallucination, data fabrication and sporadic exclusion of key species. While autonomous, machine-only efforts accurately capture coarse-grained patterns in natural enemy identity, abundance and impacts, they carry limited utility for (living) evidence syntheses or rigorous decision support. Yet, handled with prudence and due human oversight, machine power might eventually revitalize underfunded disciplines and advance nature-friendly farming.

This work is licensed under CC-BY 4.0
DOI: https://doi.org/10.1016/j.compag.2025.111317