Document type
Article
Version
Accepted version
Publication date
All rights reserved
Please use this identifier to cite or link to this item: https://hdl.handle.net/2445/174612
Penalized logistic regression to improve predictive capacity of rare events in surveys
Abstract
When logistic regression is used to model rare binary dependent variables, with far fewer events (ones) than non-events (zeros), it tends to underestimate the probability of occurrence. The vast literature devoted to the prediction of rare binary data identifies several ways to improve predictive performance by modifying the likelihood estimation. We propose two weighting mechanisms for incorporation in a pseudo-likelihood estimation that improve the predictive capacity for rare binary responses in data collected in complex surveys. We multiply the sampling weights by specific correctors, which leads to lower root mean square errors for event observations in almost all deciles. A case study is discussed in which this method is used to predict the probability of suffering a workplace accident, with a logistic regression model estimated on data from a survey conducted in Ecuador.
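The general idea of multiplying sampling weights by a corrector before a pseudo-likelihood fit can be sketched as follows. This is only an illustrative sketch, not the authors' estimator: the abstract does not state the form of the two correctors, so a constant factor applied to event observations (`event_corrector`) is used here as a hypothetical stand-in, and the data, the sampling weights, and the helper names `fit_weighted_logit` and `predict` are all simulated or invented for the example.

```python
import numpy as np

def fit_weighted_logit(X, y, w, n_iter=50, tol=1e-10):
    """Maximise the weighted (pseudo) log-likelihood by Newton-Raphson."""
    Xd = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        grad = Xd.T @ (w * (y - p))                       # weighted score
        hess = (Xd * (w * p * (1.0 - p))[:, None]).T @ Xd  # weighted information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def predict(X, beta):
    """Predicted event probabilities for covariates X."""
    Xd = np.column_stack([np.ones(X.shape[0]), X])
    return 1.0 / (1.0 + np.exp(-Xd @ beta))

# Simulated rare-event data (assumption: not the survey used in the article).
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 2))
true_beta = np.array([-4.0, 0.8, -0.5])
p_true = 1.0 / (1.0 + np.exp(-(true_beta[0] + X @ true_beta[1:])))
y = (rng.random(n) < p_true).astype(float)
sampling_w = rng.uniform(0.5, 2.0, size=n)  # hypothetical survey weights

# Hypothetical corrector: inflate the weights of event observations only.
event_corrector = 3.0
corrected_w = sampling_w * np.where(y == 1, event_corrector, 1.0)

beta_plain = fit_weighted_logit(X, y, sampling_w)
beta_corr = fit_weighted_logit(X, y, corrected_w)

events = y == 1
mean_plain = predict(X, beta_plain)[events].mean()
mean_corr = predict(X, beta_corr)[events].mean()
print("mean predicted probability for events, sampling weights only:", mean_plain)
print("mean predicted probability for events, corrected weights:   ", mean_corr)
```

In this sketch, giving events more weight raises the fitted intercept and therefore the predicted probabilities for event observations, which is the direction of correction the abstract describes. Whether root mean square error improves in a given decile depends on the corrector actually chosen.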
Citation
PESANTEZ-NARVAEZ, Jessica and GUILLÉN, Montserrat. Penalized logistic regression to improve predictive capacity of rare events in surveys. Journal of Intelligent and Fuzzy Systems. 2020. Vol. 38, no. 5, pp. 5497-5507. ISSN 1064-1246. [accessed: 9 June 2026]. Available at: https://hdl.handle.net/2445/174612