Evaluation metrics in medical imaging AI: fundamentals, pitfalls, misapplications, and recommendations

Kocak, Burak; Klontzas, Michail E.; Stanzione, Arnaldo; Meddeb, Aymen; Demircioğlu, Aydın; Bluethgen, Christian; Bressem, Keno K.; Ugga, Lorenzo; Mercaldo, Nathaniel; Díaz, Oliver; Cuocolo, Renato

dc.contributor.author	Kocak, Burak
dc.contributor.author	Klontzas, Michail E.
dc.contributor.author	Stanzione, Arnaldo
dc.contributor.author	Meddeb, Aymen
dc.contributor.author	Demircioğlu, Aydın
dc.contributor.author	Bluethgen, Christian
dc.contributor.author	Bressem, Keno K.
dc.contributor.author	Ugga, Lorenzo
dc.contributor.author	Mercaldo, Nathaniel
dc.contributor.author	Díaz, Oliver
dc.contributor.author	Cuocolo, Renato
dc.date.accessioned	2026-03-04T12:03:38Z
dc.date.available	2026-03-04T12:03:38Z
dc.date.issued	2025-09
dc.date.updated	2026-03-04T12:03:38Z
dc.description.abstract	Robust assessment of artificial intelligence (AI) models in medical imaging is paramount for reliable clinical integration. This international collaborative review paper provides an overview of key evaluation metrics across diverse tasks, including classification, regression, survival analysis, detection, and segmentation, as well as specialized metrics for calibration, foundation models, large language models, and synthetic images. Challenges of comparing models statistically and translating metric scores to clinical practice are also discussed. For each section, the paper outlines fundamental metrics, identifies common pitfalls and misapplications, and offers recommendations for more robust evaluations. Key recommendations often involve utilizing multiple, complementary metrics tailored to the specific task and dataset properties, transparent reporting of methodology, and critically, considering the clinical utility and real-world implications of model performance. Ultimately, effective evaluation requires a comprehensive, context-aware approach that goes beyond statistical metrics to ensure.
dc.format.extent	24 p.
dc.format.mimetype	application/pdf
dc.identifier.idgrec	766730
dc.identifier.uri	https://hdl.handle.net/2445/227851
dc.language.iso	eng
dc.publisher	Elsevier B.V.
dc.relation.isformatof	Reproduction of the document published at: https://doi.org/10.1016/j.ejrai.2025.100030
dc.relation.ispartof	European Journal of Radiology Artificial Intelligence, 2025, vol. 3, p. 100030
dc.relation.uri	https://doi.org/10.1016/j.ejrai.2025.100030
dc.rights	cc-by (c) Burak Kocak et al., 2025
dc.rights.accessRights	info:eu-repo/semantics/openAccess
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/
dc.source	Articles publicats en revistes (Matemàtiques i Informàtica)
dc.subject.classification	Intel·ligència artificial en medicina
dc.subject.classification	Diagnòstic per la imatge
dc.subject.classification	Aprenentatge automàtic
dc.subject.classification	Algorismes computacionals
dc.subject.other	Medical artificial intelligence
dc.subject.other	Diagnostic imaging
dc.subject.other	Machine learning
dc.subject.other	Computer algorithms
dc.title	Evaluation metrics in medical imaging AI: fundamentals, pitfalls, misapplications, and recommendations
dc.type	info:eu-repo/semantics/article
dc.type	info:eu-repo/semantics/publishedVersion

Files

Original bundle

Name: 922437.pdf
Size: 13.73 MB
Format: Adobe Portable Document Format

Collections

Articles publicats en revistes (Matemàtiques i Informàtica)