Please use this identifier to cite or link to this item:
http://hdl.handle.net/2445/200060
Title: | Sampling methods for activation correlation graphs to predict neural network generalization using topological data analysis |
Author: | Carpay, Otis |
Director/Tutor: | Escalera Guerrero, Sergio; Ballester Bautista, Rubén |
Keywords: | Neural networks (Computer science); Machine learning; Data processing; Master's theses; Algebraic topology |
Issue Date: | 15-Jan-2023 |
Abstract: | [en] The performance of a deep neural network (DNN) depends on its ability to generalize. This ability is often expressed as the generalization gap: the difference between accuracy on the training set and accuracy on the test set. Recent research has used topological data analysis to estimate this gap without a test set. In that approach, persistent homology measures are derived from a weighted graph of neuron activation correlations (the functional network graph). The resulting persistence diagram is vectorized into a number of statistical summaries, which are then correlated with the generalization gap. However, the computational complexity of persistent homology calculations hinders application to DNNs with a large number of activations, so methods are needed to sample these activations without losing predictive power. This work assesses the effect of different sampling strategies on the resulting persistence diagrams and their summaries. The strategies include stratified and non-stratified random sampling, three methods based on notions of neuron importance similar to those used in pruning, and one using $k$-means++. In line with previous research, some of these strategies yield models that predict the generalization gap with high accuracy. The investigations provide insight into the structure of the functional network activation graph and open up new lines of research on it. |
Note: | Final projects of the Master's in Fundamentals of Data Science (Màster de Fonaments de Ciència de Dades), Faculty of Mathematics, Universitat de Barcelona. Academic year: 2022-2023. Tutors: Sergio Escalera Guerrero and Rubén Ballester Bautista |
URI: | http://hdl.handle.net/2445/200060 |
Appears in Collections: | Software - Student works; Official Master's - Fundamentals of Data Science |
Files in This Item:
File | Description | Size | Format
---|---|---|---
TFM_TDA_sampling-10.pdf | Thesis report | 744.47 kB | Adobe PDF
tfm-master.zip | Source code | 105.57 MB | zip
This item is licensed under a Creative Commons License