Evaluation metrics in medical imaging AI: fundamentals, pitfalls, misapplications, and recommendations

dc.contributor.author: Kocak, Burak
dc.contributor.author: Klontzas, Michail E.
dc.contributor.author: Stanzione, Arnaldo
dc.contributor.author: Meddeb, Aymen
dc.contributor.author: Demircioğlu, Aydın
dc.contributor.author: Bluethgen, Christian
dc.contributor.author: Bressem, Keno K.
dc.contributor.author: Ugga, Lorenzo
dc.contributor.author: Mercaldo, Nathaniel
dc.contributor.author: Díaz, Oliver
dc.contributor.author: Cuocolo, Renato
dc.date.accessioned: 2026-03-04T12:03:38Z
dc.date.available: 2026-03-04T12:03:38Z
dc.date.issued: 2025-09
dc.date.updated: 2026-03-04T12:03:38Z
dc.description.abstract: Robust assessment of artificial intelligence (AI) models in medical imaging is paramount for reliable clinical integration. This international collaborative review paper provides an overview of key evaluation metrics across diverse tasks, including classification, regression, survival analysis, detection, and segmentation, as well as specialized metrics for calibration, foundation models, large language models, and synthetic images. Challenges of comparing models statistically and translating metric scores to clinical practice are also discussed. For each section, the paper outlines fundamental metrics, identifies common pitfalls and misapplications, and offers recommendations for more robust evaluations. Key recommendations often involve utilizing multiple, complementary metrics tailored to the specific task and dataset properties, transparent reporting of methodology, and critically, considering the clinical utility and real-world implications of model performance. Ultimately, effective evaluation requires a comprehensive, context-aware approach that goes beyond statistical metrics to ensure.
dc.format.extent: 24 p.
dc.format.mimetype: application/pdf
dc.identifier.idgrec: 766730
dc.identifier.uri: https://hdl.handle.net/2445/227851
dc.language.iso: eng
dc.publisher: Elsevier B.V.
dc.relation.isformatof: Reproduction of the document published at: https://doi.org/10.1016/j.ejrai.2025.100030
dc.relation.ispartof: European Journal of Radiology Artificial Intelligence, 2025, vol. 3, p. 100030
dc.relation.uri: https://doi.org/10.1016/j.ejrai.2025.100030
dc.rights: cc-by (c) Burak Kocak et al., 2025
dc.rights.accessRights: info:eu-repo/semantics/openAccess
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.source: Articles published in journals (Mathematics and Computer Science)
dc.subject.classification: Intel·ligència artificial en medicina
dc.subject.classification: Diagnòstic per la imatge
dc.subject.classification: Aprenentatge automàtic
dc.subject.classification: Algorismes computacionals
dc.subject.other: Medical artificial intelligence
dc.subject.other: Diagnostic imaging
dc.subject.other: Machine learning
dc.subject.other: Computer algorithms
dc.title: Evaluation metrics in medical imaging AI: fundamentals, pitfalls, misapplications, and recommendations
dc.type: info:eu-repo/semantics/article
dc.type: info:eu-repo/semantics/publishedVersion

Files

Original bundle

Name: 922437.pdf
Size: 13.73 MB
Format: Adobe Portable Document Format