Please use this identifier to cite or link to this item: http://hdl.handle.net/2445/200060
Title: Sampling methods for activation correlation graphs to predict neural network generalization using topological data analysis
Author: Carpay, Otis
Director/Tutor: Escalera Guerrero, Sergio; Ballester Bautista, Rubén
Keywords: Neural networks (Computer science); Machine learning; Data processing; Master's theses; Algebraic topology
Issue Date: 15-Jan-2023
Abstract: [en] The performance of a deep neural network (DNN) depends on its ability to generalize. This ability is often expressed as the difference in accuracy between the training set and the test set, known as the generalization gap. Recent research has used topological data analysis to estimate this gap without a test set. In this approach, persistent homology measures are derived from a weighted graph of neuron activation correlations (the functional network graph). The resulting persistence diagram is vectorized into a number of statistical summaries, which are then correlated with the generalization gap. However, the computational complexity of persistent homology calculations hinders its application to DNNs with a large number of activations, so methods are needed to sample these activations without losing predictive power. This work assesses the effect of different sampling strategies on the resulting persistence diagrams and their summaries. The strategies include stratified and non-stratified random sampling, three methods based on notions of neuron importance similar to those used in pruning, and one using $k$-means++. In line with previous research, some of these strategies yield models that predict the generalization gap with high accuracy. The investigations provide insight into the structure of the functional network graph and open up new lines of research.
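The pipeline summarized in the abstract (sample neurons, build a correlation-based graph, compute persistence, summarize) can be illustrated with a minimal sketch. It is not the thesis code: it covers only uniform random sampling and only 0-dimensional persistent homology, which for a Vietoris-Rips filtration reduces to the edge weights of a minimum spanning tree. The dissimilarity 1 - |correlation| and all function names (`sample_neurons`, `correlation_distance`, `h0_deaths`, `summarize`) are assumptions made for illustration.

```python
import numpy as np


def sample_neurons(activations, n_samples, rng):
    """Uniform (non-stratified) random sample of neuron columns.

    activations: array of shape (n_inputs, n_neurons).
    """
    idx = rng.choice(activations.shape[1], size=n_samples, replace=False)
    return activations[:, idx]


def correlation_distance(activations):
    """Assumed dissimilarity: 1 - |Pearson correlation| between neurons.

    Neurons with constant activation give NaN correlations and should be
    filtered out before calling this function.
    """
    corr = np.corrcoef(activations, rowvar=False)
    dist = 1.0 - np.abs(corr)
    np.fill_diagonal(dist, 0.0)
    return dist


def h0_deaths(dist):
    """Death times of the finite 0-dimensional persistence classes.

    All components are born at 0; each one dies when Kruskal's algorithm
    merges it into another, so the deaths are the MST edge weights.
    """
    n = dist.shape[0]
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    rows, cols = np.triu_indices(n, k=1)
    order = np.argsort(dist[rows, cols], kind="stable")
    deaths = []
    for e in order:
        a, b = find(int(rows[e])), find(int(cols[e]))
        if a != b:
            parent[a] = b
            deaths.append(dist[rows[e], cols[e]])
    return np.array(deaths)


def summarize(deaths):
    """A few statistical summaries of the persistence values."""
    return {
        "mean": float(deaths.mean()),
        "std": float(deaths.std()),
        "max": float(deaths.max()),
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    acts = rng.normal(size=(200, 50))  # synthetic stand-in for activations
    sub = sample_neurons(acts, 10, rng)
    print(summarize(h0_deaths(correlation_distance(sub))))
```

In a study like the one described, summaries of this kind would be computed per network and then related to the measured generalization gap. Higher-dimensional homology would require a dedicated persistent homology library.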
Note: Master's final projects, Master in Fundamentals of Data Science (Màster de Fonaments de Ciència de Dades), Faculty of Mathematics, Universitat de Barcelona. Academic year: 2022-2023. Advisors: Sergio Escalera Guerrero and Rubén Ballester Bautista
URI: http://hdl.handle.net/2445/200060
Appears in Collections: Màster Oficial - Fonaments de la Ciència de Dades; Programari - Treballs de l'alumnat

Files in This Item:
File                      Description   Size        Format
TFM_TDA_sampling-10.pdf   Report        744.47 kB   Adobe PDF
tfm-master.zip            Source code   105.57 MB   zip


This item is licensed under a Creative Commons License.