End-to-end Global to Local CNN Learning for Hand Pose Recovery in Depth data

dc.contributor.author: Madadi, Meysam
dc.contributor.author: Escalera Guerrero, Sergio
dc.contributor.author: Baró i Solé, Xavier
dc.contributor.author: Gonzàlez Sabaté, Jordi
dc.date.accessioned: 2022-11-11T07:42:10Z
dc.date.available: 2022-11-11T07:42:10Z
dc.date.issued: 2021-08-12
dc.date.updated: 2022-11-11T07:42:11Z
dc.description.abstract: Despite recent advances in 3-D pose estimation of human hands, driven by convolutional neural networks (CNNs) and depth cameras, the task is still far from solved in uncontrolled setups. This is mainly due to the highly non-linear dynamics of fingers and to self-occlusions, which make hand model training challenging. In this study, a novel hierarchical, tree-structured CNN is used, in which branches are trained to specialise in predefined subsets of hand joints called local poses. Local pose features, extracted from the hierarchical CNN branches, are then fused to learn higher-order dependencies among joints in the final pose through end-to-end training. The loss function is also defined to incorporate appearance and physical constraints on feasible hand motions and deformations. Finally, a non-rigid data augmentation approach is introduced to increase the amount of training depth data. Experimental results suggest that feeding a tree-shaped CNN, specialised in local poses, into a fusion network that models correlations and dependencies among joints helps to increase the precision of the final estimates, with competitive results on the NYU, MSRA, Hands17 and SyntheticHand datasets.
dc.format.extent: 17 p.
dc.format.mimetype: application/pdf
dc.identifier.idgrec: 713061
dc.identifier.issn: 1751-9632
dc.identifier.uri: https://hdl.handle.net/2445/190703
dc.language.iso: eng
dc.publisher: John Wiley & Sons
dc.relation.isformatof: Reproduction of the document published at: https://doi.org/10.1049/cvi2.12064
dc.relation.ispartof: IET Computer Vision, 2021, vol. 16, num. 1, p. 50-66
dc.relation.uri: https://doi.org/10.1049/cvi2.12064
dc.rights: cc-by-nc (c) Madadi, Meysam et al., 2021
dc.rights.accessRights: info:eu-repo/semantics/openAccess
dc.rights.uri: http://creativecommons.org/licenses/by-nc/3.0/es/*
dc.source: Articles published in journals (Matemàtiques i Informàtica)
dc.subject.classification: Computer vision
dc.subject.classification: Human-computer interaction
dc.subject.classification: Convolutional neural networks
dc.subject.other: Computer vision
dc.subject.other: Human-computer interaction
dc.subject.other: Convolutional neural networks
dc.title: End-to-end Global to Local CNN Learning for Hand Pose Recovery in Depth data
dc.type: info:eu-repo/semantics/article
dc.type: info:eu-repo/semantics/publishedVersion

Files

Original bundle

Name: 713061.pdf
Size: 2.97 MB
Format: Adobe Portable Document Format