End-to-end Global to Local CNN Learning for Hand Pose Recovery in Depth data

Despite recent advances in 3-D hand pose estimation, driven by convolutional neural networks (CNNs) and depth cameras, the task is still far from solved in uncontrolled setups. This is mainly due to the highly non-linear dynamics of fingers and to self-occlusions, which make training a hand model challenging. In this study, a novel hierarchical tree-structured CNN is used, in which branches are trained to specialise in predefined subsets of hand joints, called local poses. Local pose features extracted from the hierarchical CNN branches are then fused, through end-to-end training, to learn higher-order dependencies among joints in the final pose. In addition, the loss function is defined to incorporate appearance and physical constraints on feasible hand motions and deformations. Finally, a non-rigid data augmentation approach is introduced to increase the amount of training depth data. Experimental results suggest that feeding a tree-shaped CNN, specialised in local poses, into a fusion network that models correlations and dependencies among joints helps to increase the precision of the final estimates, with competitive results on the NYU, MSRA, Hands17 and SyntheticHand datasets.
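As a rough illustration of two ideas in the abstract (predefined joint subsets called local poses, and a loss that adds a physical constraint to the joint error), here is a minimal sketch in plain Python. Everything specific in it is an assumption made for illustration and is not taken from the paper: the 21-joint layout, the per-finger grouping in `LOCAL_POSES`, the choice of bone-length consistency as the physical constraint, and the `weight` parameter. The paper's actual network, joint grouping and loss terms may differ.

```python
import math

# Hypothetical partition of a 21-joint hand skeleton into "local poses".
# Joint 0 is the wrist; each finger owns four consecutive joints (assumption).
LOCAL_POSES = {
    "thumb": [1, 2, 3, 4],
    "index": [5, 6, 7, 8],
    "middle": [9, 10, 11, 12],
    "ring": [13, 14, 15, 16],
    "pinky": [17, 18, 19, 20],
}

# Bones as (parent, child) pairs: wrist to each finger base, then along each finger.
BONES = [(0, joints[0]) for joints in LOCAL_POSES.values()] + [
    (a, b) for joints in LOCAL_POSES.values() for a, b in zip(joints, joints[1:])
]


def bone_lengths(pose):
    """Euclidean length of every bone in a pose given as a list of (x, y, z)."""
    return [math.dist(pose[a], pose[b]) for a, b in BONES]


def joint_loss(pred, target):
    """Mean squared Euclidean distance between predicted and target joints."""
    return sum(math.dist(p, t) ** 2 for p, t in zip(pred, target)) / len(target)


def constrained_loss(pred, target, weight=0.1):
    """Joint error plus a bone-length deviation penalty (an assumed constraint)."""
    penalty = sum(
        (lp - lt) ** 2 for lp, lt in zip(bone_lengths(pred), bone_lengths(target))
    ) / len(BONES)
    return joint_loss(pred, target) + weight * penalty
```

In this sketch a prediction that is merely translated in space is penalised only by the joint-error term, because its bone lengths are unchanged, whereas a prediction that stretches or shrinks a finger also incurs the constraint penalty.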

Subjects

Computer vision, Human-computer interaction, Convolutional neural networks

Collections

Articles published in journals (Mathematics and Computer Science)

Citation

MADADI, Meysam, ESCALERA GUERRERO, Sergio, BARÓ I SOLÉ, Xavier, GONZÀLEZ SABATÉ, Jordi. End-to-end Global to Local CNN Learning for Hand Pose Recovery in Depth data. _IET Computer Vision_. 2021. Vol. 16, no. 1, pp. 50-66. [accessed: 25 February 2026]. ISSN: 1751-9632. [Available at: https://hdl.handle.net/2445/190703]
