Abstract: Facial expressions are a vital means of communication between humans in social contexts. They serve as conversational markers and convey information about affective and cognitive states. Many applications would benefit from advances in automatic facial expression recognition (AFER). Robust AFER would improve human-computer interaction, increase driving safety, help medical personnel take better care of patients with impaired communication ability, and could transform online education. In recent years, significant progress has been made in AFER through the use of deep neural networks (DNNs). Unfortunately, this increase in performance has come with increased opacity. The current status of DNNs as "black-box" models hinders the advancement of the field. In this dissertation, we propose a new general framework for analysing deep neural networks based on the systematic study of their topology while they are learning patterns in the data. We use this framework to study a newly proposed DNN built specifically for Action Unit recognition, which results in better understanding, better control and increased performance. In summary, this dissertation makes the following main contributions: a) definition of a comprehensive taxonomy of computer vision approaches to automatic facial expression recognition, followed by an extended review of historical and current trends in AFER; b) proposal of a model that learns the representation, patch and output structure of the face end-to-end; c) introduction of a structure inference topology that replicates the inference algorithms of probabilistic graphical models using a recurrent neural network; d) an extended ablation study and experimental analysis of the newly proposed architecture; e) analysis and performance improvement of the previously proposed facial expression recognition architecture using the new theoretical framework; f) formulation of a novel general framework for the analysis of deep neural networks based on algebraic topology; g) analysis of the fundamental topological differences between DNNs that learn and DNNs that memorize; h) demonstration of the newly proposed analytical framework on facial action unit recognition using DSIN.
