Please use this identifier to cite or link to this item: https://hdl.handle.net/2445/189541
Title: On the use of the descriptive variable for enhancing the aggregation of crowdsourced labels
Author: Beñaran-Muñoz, Iker
Hernández-González, Jerónimo
Pérez, Aritz
Keywords: Machine learning
Participatory culture
Big data
Issue Date: 30-Sep-2022
Publisher: Springer Verlag
Abstract: The use of crowdsourcing for annotating data has become a popular and cheap alternative to expert labelling. As a consequence, an aggregation task is required to combine the different labels provided and agree on a single one per example. Most aggregation techniques, including the simple and robust majority voting (selecting the label with the largest number of votes), disregard the descriptive information provided by the explanatory variable. In this paper, we propose domain-aware voting, an extension of majority voting which incorporates the descriptive variable and the rest of the instances of the dataset for aggregating the label of every instance. The experimental results with simulated and real-world crowdsourced data suggest that domain-aware voting is a competitive alternative to majority voting, especially when a part of the dataset is unlabelled. We elaborate on practical criteria for the use of domain-aware voting.
Note: Reproduction of the document published at: https://doi.org/10.1007/s10115-022-01743-z
It is part of: Knowledge and Information Systems, 2022
URI: https://hdl.handle.net/2445/189541
Related resource: https://doi.org/10.1007/s10115-022-01743-z
ISSN: 0219-1377
Appears in Collections: Articles publicats en revistes (Matemàtiques i Informàtica)

Files in This Item:
File: 725388.pdf
Size: 1.75 MB
Format: Adobe PDF


This item is licensed under a Creative Commons License.