Sampling methods for activation correlation graphs to predict neural network generalization using topological data analysis

dc.contributor.advisor: Escalera Guerrero, Sergio
dc.contributor.advisor: Ballester Bautista, Rubén
dc.contributor.author: Carpay, Otis
dc.date.accessioned: 2023-06-29T06:59:24Z
dc.date.available: 2023-06-29T06:59:24Z
dc.date.issued: 2023-01-15
dc.description: Master's final projects, Master in Fundamentals of Data Science, Faculty of Mathematics, Universitat de Barcelona. Academic year: 2022-2023. Advisors: Sergio Escalera Guerrero and Rubén Ballester Bautista (lang: ca)
dc.description.abstract: [en] The performance of a deep neural network (DNN) depends on its ability to generalize. This ability is often expressed as the difference in accuracy between the training set and the test set, known as the generalization gap. Recent research has used topological data analysis to estimate this gap without a test set. In this approach, persistent homology measures are derived from a weighted graph of neuron activation correlations (the functional network graph). The resulting persistence diagram is vectorized into a number of statistical summaries, which are then correlated with the generalization gap. However, the computational complexity of persistent homology calculations hinders the application to DNNs with a larger number of activations, so methods are needed to sample these activations without losing predictive power. This work assesses the effect of different sampling strategies on the resulting persistence diagrams and their summaries. The strategies include stratified and non-stratified random sampling, three methods based on notions of neuron importance similar to those used in pruning, and one using $k$-means++. In line with previous research, some of these strategies yield models that predict the generalization gap with high accuracy. The investigations provide insight into the structure of the functional network graph and open up new lines of research. (lang: ca)
dc.format.extent: 41 p.
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/2445/200060
dc.language.iso: eng (lang: ca)
dc.rights: cc-by-nc-nd (c) Otis Carpay, 2023
dc.rights: code: GPL (c) Otis Carpay, 2023
dc.rights.accessRights: info:eu-repo/semantics/openAccess (lang: ca)
dc.rights.uri: http://www.gnu.org/licenses/gpl-3.0.ca.html (lang: *)
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/ (lang: *)
dc.source: Màster Oficial - Fonaments de la Ciència de Dades
dc.subject.classification: Xarxes neuronals (Informàtica)
dc.subject.classification: Aprenentatge automàtic
dc.subject.classification: Processament de dades
dc.subject.classification: Treballs de fi de màster
dc.subject.classification: Topologia algebraica (lang: ca)
dc.subject.other: Neural networks (Computer science)
dc.subject.other: Machine learning
dc.subject.other: Data processing
dc.subject.other: Master's theses
dc.subject.other: Algebraic topology (lang: en)
dc.title: Sampling methods for activation correlation graphs to predict neural network generalization using topological data analysis (lang: ca)
dc.type: info:eu-repo/semantics/masterThesis (lang: ca)
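
Illustrative sketch (not taken from the thesis or its source code): the pipeline summarized in the abstract can be outlined roughly as follows, under several assumptions. The distance 1 - |Pearson correlation|, the restriction to 0-dimensional persistent homology, the uniform (non-stratified) random sampling, and the function names `correlation_distance`, `h0_death_times`, and `sample_and_summarize` are all hypothetical choices made for this example. It relies on the standard fact that the death times of 0-dimensional Vietoris-Rips persistence equal the edge weights of a minimum spanning tree.

```python
import numpy as np


def correlation_distance(activations):
    """Distance matrix 1 - |Pearson correlation| between neurons.

    activations has shape (n_inputs, n_neurons). The choice of distance
    is an assumption for this sketch, not the thesis's definition.
    """
    corr = np.corrcoef(activations, rowvar=False)
    corr = np.nan_to_num(corr)  # constant neurons produce NaN correlations
    dist = 1.0 - np.abs(corr)
    np.fill_diagonal(dist, 0.0)
    return dist


def h0_death_times(dist):
    """Death times of 0-dimensional persistent homology.

    For a Vietoris-Rips filtration these are the edge weights of a
    minimum spanning tree, computed here with Prim's algorithm.
    """
    n = dist.shape[0]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()  # cheapest known edge from the tree to each node
    deaths = []
    for _ in range(n - 1):
        candidates = np.where(in_tree, np.inf, best)
        j = int(np.argmin(candidates))
        deaths.append(float(candidates[j]))
        in_tree[j] = True
        best = np.minimum(best, dist[j])
    return np.sort(np.array(deaths))


def sample_and_summarize(activations, n_sampled, rng):
    """Uniformly sample neurons, then summarize the H0 death times."""
    idx = rng.choice(activations.shape[1], size=n_sampled, replace=False)
    deaths = h0_death_times(correlation_distance(activations[:, idx]))
    return {
        "mean_death": float(deaths.mean()),
        "std_death": float(deaths.std()),
        "max_death": float(deaths.max()),
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_activations = rng.normal(size=(200, 50))  # synthetic stand-in data
    print(sample_and_summarize(fake_activations, n_sampled=10, rng=rng))
```

In a setting like the one the abstract describes, summaries of this kind would be computed per trained network and then related to the measured generalization gap. The importance-based and $k$-means++ sampling strategies mentioned in the abstract would replace the `rng.choice` step and are not sketched here.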

Files

Original bundle

Name: TFM_TDA_sampling-10.pdf
Size: 744.47 KB
Format: Adobe Portable Document Format
Description: Thesis report

Name: tfm-master.zip
Size: 103.09 MB
Format: ZIP file
Description: Source code