Biased accuracy in multisite machine-learning studies due to incomplete removal of the effects of the site

dc.contributor.author: Solanes, Aleix
dc.contributor.author: Palau, Pol
dc.contributor.author: Fortea, Lydia
dc.contributor.author: Salvador, Raymond
dc.contributor.author: González Navarro, Laura
dc.contributor.author: Llach, Cristian
dc.contributor.author: Valentí Ribas, Marc
dc.contributor.author: Vieta i Pascual, Eduard, 1963-
dc.contributor.author: Radua, Joaquim
dc.date.accessioned: 2025-03-20T13:53:02Z
dc.date.available: 2025-03-20T13:53:02Z
dc.date.issued: 2021-08-30
dc.date.updated: 2025-03-20T13:53:02Z
dc.description.abstract: Brain MRI researchers conducting multisite studies, such as within the ENIGMA Consortium, are well aware of the importance of controlling the effects of the site (EoS) in the statistical analysis. Conversely, authors of newer machine-learning MRI studies may remove the EoS when training the machine-learning models but not control them when estimating the models' accuracy, potentially leading to severely biased estimates. We show examples from a toy simulation study and real MRI data in which we remove the EoS from both the "training set" and the "test set" during the training and application of the model. However, the accuracy is still inflated (or occasionally shrunk) unless we further control the EoS during the estimation of the accuracy. We also provide several methods for controlling the EoS during the estimation of the accuracy, and a simple R package ("multisite.accuracy") that smoothly does this task for several accuracy estimates (e.g., sensitivity/specificity, area under the curve, correlation, hazard ratio).
dc.format.extent: 21 p.
dc.format.mimetype: application/pdf
dc.identifier.idgrec: 717059
dc.identifier.idimarina: 9243705
dc.identifier.issn: 0925-4927
dc.identifier.pmid: 34098248
dc.identifier.uri: https://hdl.handle.net/2445/219882
dc.language.iso: eng
dc.publisher: Elsevier B.V.
dc.relation.isformatof: Postprint version of the document published at: https://doi.org/10.1016/j.pscychresns.2021.111313
dc.relation.ispartof: Psychiatry Research-Neuroimaging, 2021, vol. 314
dc.relation.uri: https://doi.org/10.1016/j.pscychresns.2021.111313
dc.rights: cc-by-nc-nd (c) Elsevier B.V., 2021
dc.rights.accessRights: info:eu-repo/semantics/openAccess
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.source: Articles published in journals (Medicine)
dc.subject.classification: Aprenentatge automàtic
dc.subject.classification: Estadística mèdica
dc.subject.classification: Imatges per ressonància magnètica
dc.subject.other: Machine learning
dc.subject.other: Medical statistics
dc.subject.other: Magnetic resonance imaging
dc.title: Biased accuracy in multisite machine-learning studies due to incomplete removal of the effects of the site
dc.type: info:eu-repo/semantics/article
dc.type: info:eu-repo/semantics/acceptedVersion
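
The bias described in the abstract can be illustrated with a minimal toy sketch. It is not the authors' simulation and does not use the "multisite.accuracy" package; the site names, prevalences, residual offset, and function names below are all hypothetical. It assumes two sites with different proportions of patients and a model score that carries a leftover site offset but no true signal. Pooling the sites then yields an area under the curve (AUC) above chance, whereas estimating the AUC within each site and averaging (one possible way of controlling the EoS at the accuracy-estimation stage) brings it back to about 0.5.

```python
import random


def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def simulate(n_per_site=1000, seed=0):
    """Return (site, score, label) rows for two hypothetical sites."""
    rng = random.Random(seed)
    # Assumed values: site -> (proportion of patients, residual site offset in the score)
    sites = {"A": (0.8, 1.0), "B": (0.2, 0.0)}
    rows = []
    for site, (prevalence, offset) in sites.items():
        for _ in range(n_per_site):
            label = 1 if rng.random() < prevalence else 0
            # No true signal: within a site the score is independent of the label.
            score = offset + rng.gauss(0.0, 1.0)
            rows.append((site, score, label))
    return rows


def pooled_auc(rows):
    """AUC over all subjects, ignoring which site they come from."""
    return auc([r[1] for r in rows], [r[2] for r in rows])


def site_stratified_auc(rows):
    """Sample-size-weighted mean of the AUCs computed separately within each site."""
    total, weight = 0.0, 0
    for site in sorted({r[0] for r in rows}):
        sub = [r for r in rows if r[0] == site]
        total += len(sub) * auc([r[1] for r in sub], [r[2] for r in sub])
        weight += len(sub)
    return total / weight


if __name__ == "__main__":
    rows = simulate()
    print("pooled AUC:         ", round(pooled_auc(rows), 3))
    print("site-stratified AUC:", round(site_stratified_auc(rows), 3))
```

In this sketch the pooled estimate is inflated only because site membership predicts both the score and the diagnosis; stratifying by site removes that shortcut. Weighting by sample size is a design choice made here for simplicity, and other weightings are equally defensible.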

Files

Original bundle

Name: 244296.pdf
Size: 240.46 KB
Format: Adobe Portable Document Format